New Upstream Snapshot - ruby-csv
Ready changes
Summary
Merged new upstream version: 3.2.6+git20230128.1.e5622c5 (was: 3.2.2).
Resulting package
Built on 2023-02-09T16:58 (took 10m12s)
The resulting binary package can be installed (if you have the apt repository enabled) by running:
apt install -t fresh-snapshots ruby-csv
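Once the package is installed, two of the additions listed in NEWS.md in the diff can be tried out with a short script. This is a minimal sketch, not part of the upstream change: it assumes `require "csv"` loads the newly installed version, and the sample rows and the `First Name` header are made up for illustration. Each feature is guarded so the script also runs against an older csv.

```ruby
require "csv"

# Illustrative sample data (not from the package).
rows = [["foo", "0"], ["bar", "1"], ["baz", "2"]]

# CSV.generate_lines was added in 3.2.5. On older versions, fall back to
# the equivalent CSV.generate loop, which is how the diff implements it.
text =
  if CSV.respond_to?(:generate_lines)
    CSV.generate_lines(rows)
  else
    CSV.generate { |csv| rows.each { |row| csv << row } }
  end
puts text # prints the three rows, one per line

# The :symbol_raw header converter (3.2.3) turns headers into symbols
# without downcasing them or replacing whitespace.
if CSV::HeaderConverters.key?(:symbol_raw)
  table = CSV.parse("First Name,Value\nfoo,0\n",
                    headers: true,
                    header_converters: :symbol_raw)
  headers = table.headers
  p headers # => [:"First Name", :Value]
end
```

If the `:symbol_raw` branch is skipped, the csv library that was loaded predates 3.2.3 and the snapshot is probably not the one on the load path.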
Diff
diff --git a/Gemfile b/Gemfile
new file mode 100644
index 0000000..c743311
--- /dev/null
+++ b/Gemfile
@@ -0,0 +1,4 @@
+source 'https://rubygems.org'
+
+# Specify your gem's dependencies in csv.gemspec
+gemspec
diff --git a/NEWS.md b/NEWS.md
index 51eb456..05f2419 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,144 @@
# News
+## 3.2.6 - 2022-12-08
+
+### Improvements
+
+ * `CSV#read` consumes the same lines with other methods like
+ `CSV#shift`.
+ [[GitHub#258](https://github.com/ruby/csv/issues/258)]
+ [Reported by Lhoussaine Ghallou]
+
+ * All `Enumerable` based methods consume the same lines with other
+ methods. This may have a performance penalty.
+ [[GitHub#260](https://github.com/ruby/csv/issues/260)]
+ [Reported by Lhoussaine Ghallou]
+
+ * Simplify some implementations.
+ [[GitHub#262](https://github.com/ruby/csv/pull/262)]
+ [[GitHub#263](https://github.com/ruby/csv/pull/263)]
+ [Patch by Mau Magnaguagno]
+
+### Fixes
+
+ * Fixed `CSV.generate_lines` document.
+ [[GitHub#257](https://github.com/ruby/csv/pull/257)]
+ [Patch by Sampat Badhe]
+
+### Thanks
+
+ * Sampat Badhe
+
+ * Lhoussaine Ghallou
+
+ * Mau Magnaguagno
+
+## 3.2.5 - 2022-08-26
+
+### Improvements
+
+ * Added `CSV.generate_lines`.
+ [[GitHub#255](https://github.com/ruby/csv/issues/255)]
+ [Reported by OKURA Masafumi]
+ [[GitHub#256](https://github.com/ruby/csv/pull/256)]
+ [Patch by Eriko Sugiyama]
+
+### Thanks
+
+ * OKURA Masafumi
+
+ * Eriko Sugiyama
+
+## 3.2.4 - 2022-08-22
+
+### Improvements
+
+ * Cleaned up internal implementations.
+ [[GitHub#249](https://github.com/ruby/csv/pull/249)]
+ [[GitHub#250](https://github.com/ruby/csv/pull/250)]
+ [[GitHub#251](https://github.com/ruby/csv/pull/251)]
+ [Patch by Mau Magnaguagno]
+
+ * Added support for RFC 3339 style time.
+ [[GitHub#248](https://github.com/ruby/csv/pull/248)]
+ [Patch by Thierry Lambert]
+
+ * Added support for transcoding String CSV. Syntax is
+ `from-encoding:to-encoding`.
+ [[GitHub#254](https://github.com/ruby/csv/issues/254)]
+ [Reported by Richard Stueven]
+
+ * Added quoted information to `CSV::FieldInfo`.
+ [[GitHub#253](https://github.com/ruby/csv/pull/253)]
+ [Reported by Hirokazu SUZUKI]
+
+### Fixes
+
+ * Fixed a link in documents.
+ [[GitHub#244](https://github.com/ruby/csv/pull/244)]
+ [Patch by Peter Zhu]
+
+### Thanks
+
+ * Peter Zhu
+
+ * Mau Magnaguagno
+
+ * Thierry Lambert
+
+ * Richard Stueven
+
+ * Hirokazu SUZUKI
+
+## 3.2.3 - 2022-04-09
+
+### Improvements
+
+ * Added contents summary to `CSV::Table#inspect`.
+ [GitHub#229][Patch by Eriko Sugiyama]
+ [GitHub#235][Patch by Sampat Badhe]
+
+ * Suppressed `$INPUT_RECORD_SEPARATOR` deprecation warning by
+ `Warning.warn`.
+ [GitHub#233][Reported by Jean byroot Boussier]
+
+ * Improved error message for liberal parsing with quoted values.
+ [GitHub#231][Patch by Nikolay Rys]
+
+ * Fixed typos in documentation.
+ [GitHub#236][Patch by Sampat Badhe]
+
+ * Added `:max_field_size` option and deprecated `:field_size_limit` option.
+ [GitHub#238][Reported by Dan Buettner]
+
+ * Added `:symbol_raw` to built-in header converters.
+ [GitHub#237][Reported by taki]
+ [GitHub#239][Patch by Eriko Sugiyama]
+
+### Fixes
+
+ * Fixed a bug that some texts may be dropped unexpectedly.
+ [Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
+
+ * Fixed a bug that `:field_size_limit` doesn't work with not complex row.
+ [GitHub#238][Reported by Dan Buettner]
+
+### Thanks
+
+ * Hassan Abdul Rehman
+
+ * Eriko Sugiyama
+
+ * Jean byroot Boussier
+
+ * Nikolay Rys
+
+ * Sampat Badhe
+
+ * Dan Buettner
+
+ * taki
+
## 3.2.2 - 2021-12-24
### Improvements
@@ -15,9 +154,6 @@
* Fixed a bug that all of `ARGF` contents may not be consumed.
[GitHub#228][Reported by Rafael Navaza]
- * Fixed a bug that some texts may be dropped unexpectedly.
- [Bug #18245][ruby-core:105587][Reported by Hassan Abdul Rehman]
-
### Thanks
* adamroyjones
@@ -26,8 +162,6 @@
* Rafael Navaza
- * Hassan Abdul Rehman
-
## 3.2.1 - 2021-10-23
### Improvements
diff --git a/Rakefile b/Rakefile
new file mode 100644
index 0000000..d23b604
--- /dev/null
+++ b/Rakefile
@@ -0,0 +1,68 @@
+require "rbconfig"
+require "rdoc/task"
+
+require "bundler/gem_tasks"
+
+spec = Bundler::GemHelper.gemspec
+
+desc "Run test"
+task :test do
+ ruby("run-test.rb")
+end
+
+task :default => :test
+
+namespace :warning do
+ desc "Treat warning as error"
+ task :error do
+ def Warning.warn(*message)
+ super
+ raise "Treat warning as error:\n" + message.join("\n")
+ end
+ end
+end
+
+RDoc::Task.new do |rdoc|
+ rdoc.options = spec.rdoc_options
+ rdoc.rdoc_files.include(*spec.source_paths)
+ rdoc.rdoc_files.include(*spec.extra_rdoc_files)
+end
+
+benchmark_tasks = []
+namespace :benchmark do
+ Dir.glob("benchmark/*.yaml").sort.each do |yaml|
+ name = File.basename(yaml, ".*")
+ env = {
+ "RUBYLIB" => nil,
+ "BUNDLER_ORIG_RUBYLIB" => nil,
+ }
+ command_line = [
+ RbConfig.ruby, "-v", "-S", "benchmark-driver", File.expand_path(yaml),
+ ]
+
+ desc "Run #{name} benchmark"
+ task name do
+ puts("```")
+ sh(env, *command_line)
+ puts("```")
+ end
+ benchmark_tasks << "benchmark:#{name}"
+
+ case name
+ when /\Aparse/, "shift"
+ namespace name do
+ desc "Run #{name} benchmark: small"
+ task :small do
+ puts("```")
+ sh(env.merge("N_COLUMNS" => "10"),
+ *command_line)
+ puts("```")
+ end
+ benchmark_tasks << "benchmark:#{name}:small"
+ end
+ end
+ end
+end
+
+desc "Run all benchmarks"
+task :benchmark => benchmark_tasks
diff --git a/benchmark/convert_nil.yaml b/benchmark/convert_nil.yaml
new file mode 100644
index 0000000..f32c6f1
--- /dev/null
+++ b/benchmark/convert_nil.yaml
@@ -0,0 +1,23 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.1
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ csv_text = <<CSV
+ foo,bar,,baz
+ hoge,,temo,
+ roo,goo,por,kosh
+ CSV
+ convert_nil = ->(s) {s || ""}
+benchmark:
+ 'not convert': CSV.parse(csv_text)
+ converter: |-
+ CSV.parse(csv_text, converters: convert_nil)
+ option: |-
+ CSV.parse(csv_text, nil_value: "")
diff --git a/benchmark/parse.yaml b/benchmark/parse.yaml
new file mode 100644
index 0000000..25ccaf2
--- /dev/null
+++ b/benchmark/parse.yaml
@@ -0,0 +1,30 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.1
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "50"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "1000"), 10)
+ alphas = ["AAAAA"] * n_columns
+ unquoted = (alphas.join(",") + "\r\n") * n_rows
+ quoted = (alphas.map { |s| %("#{s}") }.join(",") + "\r\n") * n_rows
+ mixed = (alphas.map.with_index { |s, i| i.odd? ? s : %("#{s}") }.join(",") + "\r\n") * n_rows
+ inc_col_sep = (alphas.map { |s| %(",#{s}") }.join(",") + "\r\n") * n_rows
+ inc_row_sep = (alphas.map { |s| %("#{s}\r\n") }.join(",") + "\r\n") * n_rows
+ hiraganas = ["あああああ"] * n_columns
+ enc_utf8 = (hiraganas.join(",") + "\r\n") * n_rows
+ enc_sjis = enc_utf8.encode("Windows-31J")
+benchmark:
+ unquoted: CSV.parse(unquoted)
+ quoted: CSV.parse(quoted)
+ mixed: CSV.parse(mixed)
+ include_col_sep: CSV.parse(inc_col_sep)
+ include_row_sep: CSV.parse(inc_row_sep)
+ encode_utf-8: CSV.parse(enc_utf8)
+ encode_sjis: CSV.parse(enc_sjis)
diff --git a/benchmark/parse_liberal_parsing.yaml b/benchmark/parse_liberal_parsing.yaml
new file mode 100644
index 0000000..dcbf598
--- /dev/null
+++ b/benchmark/parse_liberal_parsing.yaml
@@ -0,0 +1,45 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "50"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "1000"), 10)
+ alphas = ['\"\"a\"\"'] * n_columns
+ unquoted = (alphas.join(",") + "\r\n") * n_rows
+ quoted = (alphas.map { |s| %("#{s}") }.join(",") + "\r\n") * n_rows
+ inc_col_sep = (alphas.map { |s| %(",#{s}") }.join(",") + "\r\n") * n_rows
+ inc_row_sep = (alphas.map { |s| %("#{s}") }.join(",") + "\r\n") * n_rows
+ hiraganas = ["あああああ"] * n_columns
+ enc_utf8 = (hiraganas.join(",") + "\r\n") * n_rows
+ enc_sjis = enc_utf8.encode("Windows-31J")
+benchmark:
+ unquoted: |-
+ CSV.parse(unquoted, liberal_parsing: true)
+ unquoted_backslash_quote: |-
+ CSV.parse(unquoted, liberal_parsing: {
+ backslash_quote: true,
+ })
+ quoted: |-
+ CSV.parse(quoted, liberal_parsing: true)
+ quoted_double_quote_outside_quote: |-
+ CSV.parse(quoted, liberal_parsing: {
+ double_quote_outside_quote: true
+ })
+ quoted_backslash_quote: |-
+ CSV.parse(quoted, liberal_parsing: {
+ double_quote_outside_quote: true,
+ backslash_quote: true,
+ })
+ include_col_sep: |-
+ CSV.parse(inc_col_sep, liberal_parsing: true)
+ include_row_sep: |-
+ CSV.parse(inc_row_sep, liberal_parsing: true)
+ encode_utf-8: |-
+ CSV.parse(enc_utf8, liberal_parsing: true)
+ encode_sjis: |-
+ CSV.parse(enc_sjis, liberal_parsing: true)
diff --git a/benchmark/parse_quote_char_nil.yaml b/benchmark/parse_quote_char_nil.yaml
new file mode 100644
index 0000000..f92fd33
--- /dev/null
+++ b/benchmark/parse_quote_char_nil.yaml
@@ -0,0 +1,20 @@
+loop_count: 100
+contexts:
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "50"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "1000"), 10)
+ alphas = ["AAAAA"] * n_columns
+ unquoted = (alphas.join(",") + "\r\n") * n_rows
+ col_sep_space = (alphas.join(" ") + "\r\n") * n_rows
+
+benchmark:
+ without_quote_char: |-
+ CSV.parse(unquoted)
+ quote_char_nil: |-
+ CSV.parse(unquoted, quote_char: nil)
+ col_sep_space: |-
+ CSV.parse(col_sep_space, quote_char: nil, col_sep: " ")
diff --git a/benchmark/parse_strip.yaml b/benchmark/parse_strip.yaml
new file mode 100644
index 0000000..a0230fd
--- /dev/null
+++ b/benchmark/parse_strip.yaml
@@ -0,0 +1,17 @@
+loop_count: 100
+contexts:
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "50"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "1000"), 10)
+ alphas = ["AAAAA"] * n_columns
+ quoted = (alphas.map { |s| %("#{s}") }.join(",") + "\r\n") * n_rows
+
+benchmark:
+ default: |-
+ CSV.parse(quoted)
+ no_quote_strip: |-
+ CSV.parse(quoted, quote_char: nil, strip: '"')
diff --git a/benchmark/read.yaml b/benchmark/read.yaml
new file mode 100644
index 0000000..b06dbe1
--- /dev/null
+++ b/benchmark/read.yaml
@@ -0,0 +1,29 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.1
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ CSV.open("/tmp/file.csv", "w") do |csv|
+ csv << ["player", "gameA", "gameB"]
+ 1000.times do
+ csv << ['"Alice"', "84.0", "79.5"]
+ csv << ['"Bob"', "20.0", "56.5"]
+ end
+ end
+benchmark:
+ "CSV.foreach": |-
+ CSV.foreach("/tmp/file.csv") do |row|
+ end
+ "CSV#shift": |-
+ CSV.open("/tmp/file.csv") do |csv|
+ while _line = csv.shift
+ end
+ end
+ "CSV.read": CSV.read("/tmp/file.csv")
+ "CSV.table": CSV.table("/tmp/file.csv")
diff --git a/benchmark/shift.yaml b/benchmark/shift.yaml
new file mode 100644
index 0000000..eb6fd80
--- /dev/null
+++ b/benchmark/shift.yaml
@@ -0,0 +1,20 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.1
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "50"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "1000"), 10)
+ alphas = ["AAAAA"] * n_columns
+ data = (alphas.join(",") + "\r\n") * n_rows
+benchmark:
+ shift: |-
+ csv = CSV.new(data)
+ while csv.shift do
+ end
diff --git a/benchmark/write.yaml b/benchmark/write.yaml
new file mode 100644
index 0000000..5b7d194
--- /dev/null
+++ b/benchmark/write.yaml
@@ -0,0 +1,71 @@
+loop_count: 100
+contexts:
+ - gems:
+ csv: 3.0.1
+ - gems:
+ csv: 3.0.2
+ - name: "master"
+ prelude: |
+ $LOAD_PATH.unshift(File.expand_path("lib"))
+ require "csv"
+prelude: |-
+ n_columns = Integer(ENV.fetch("N_COLUMNS", "5"), 10)
+ n_rows = Integer(ENV.fetch("N_ROWS", "100"), 10)
+ fields = ["AAAAA"] * n_columns
+ headers = n_columns.times.collect do |i|
+ "header#{i}"
+ end
+ row = CSV::Row.new(headers, fields)
+ raw_row = {}
+ n_columns.times do |i|
+ raw_row[headers[i]] = fields[i]
+ end
+benchmark:
+ "generate_line: fields": |-
+ n_rows.times do
+ CSV.generate_line(fields)
+ end
+ "generate_line: Row": |-
+ n_rows.times do
+ CSV.generate_line(row)
+ end
+ "generate_line: Hash": |-
+ n_rows.times do
+ CSV.generate_line(raw_row, headers: headers)
+ end
+ "<< fields": |-
+ output = StringIO.new
+ csv = CSV.new(output)
+ n_rows.times do
+ csv << fields
+ end
+ "<< Row": |-
+ output = StringIO.new
+ csv = CSV.new(output)
+ n_rows.times do
+ csv << row
+ end
+ "<< Hash": |-
+ output = StringIO.new
+ csv = CSV.new(output, headers: headers)
+ n_rows.times do
+ csv << raw_row
+ end
+ "<< fields: write headers": |-
+ output = StringIO.new
+ csv = CSV.new(output, headers: headers, write_headers: true)
+ n_rows.times do
+ csv << fields
+ end
+ "<< Row: write headers": |-
+ output = StringIO.new
+ csv = CSV.new(output, headers: headers, write_headers: true)
+ n_rows.times do
+ csv << row
+ end
+ "<< Hash: write headers": |-
+ output = StringIO.new
+ csv = CSV.new(output, headers: headers, write_headers: true)
+ n_rows.times do
+ csv << raw_row
+ end
diff --git a/bin/console b/bin/console
new file mode 100755
index 0000000..954718c
--- /dev/null
+++ b/bin/console
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+
+require "bundler/setup"
+require "csv"
+
+# You can add fixtures and/or initialization code here to make experimenting
+# with your gem easier. You can also use a different console, if you like.
+
+# (If you use this, don't forget to add pry to your Gemfile!)
+# require "pry"
+# Pry.start
+
+require "irb"
+IRB.start(__FILE__)
diff --git a/bin/setup b/bin/setup
new file mode 100755
index 0000000..dce67d8
--- /dev/null
+++ b/bin/setup
@@ -0,0 +1,8 @@
+#!/usr/bin/env bash
+set -euo pipefail
+IFS=$'\n\t'
+set -vx
+
+bundle install
+
+# Do any other automated setup that you need to do here
diff --git a/csv.gemspec b/csv.gemspec
index adc621c..11c5b0f 100644
--- a/csv.gemspec
+++ b/csv.gemspec
@@ -1,41 +1,64 @@
-#########################################################
-# This file has been automatically generated by gem2tgz #
-#########################################################
-# -*- encoding: utf-8 -*-
-# stub: csv 3.2.2 ruby lib
+# frozen_string_literal: true
-Gem::Specification.new do |s|
- s.name = "csv".freeze
- s.version = "3.2.2"
+begin
+ require_relative "lib/csv/version"
+rescue LoadError
+ # for Ruby core repository
+ require_relative "version"
+end
- s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
- s.require_paths = ["lib".freeze]
- s.authors = ["James Edward Gray II".freeze, "Kouhei Sutou".freeze]
- s.date = "2021-12-24"
- s.description = "The CSV library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.".freeze
- s.email = [nil, "kou@cozmixng.org".freeze]
- s.extra_rdoc_files = ["LICENSE.txt".freeze, "NEWS.md".freeze, "README.md".freeze, "doc/csv/recipes/filtering.rdoc".freeze, "doc/csv/recipes/generating.rdoc".freeze, "doc/csv/recipes/parsing.rdoc".freeze, "doc/csv/recipes/recipes.rdoc".freeze]
- s.files = ["LICENSE.txt".freeze, "NEWS.md".freeze, "README.md".freeze, "doc/csv/arguments/io.rdoc".freeze, "doc/csv/options/common/col_sep.rdoc".freeze, "doc/csv/options/common/quote_char.rdoc".freeze, "doc/csv/options/common/row_sep.rdoc".freeze, "doc/csv/options/generating/force_quotes.rdoc".freeze, "doc/csv/options/generating/quote_empty.rdoc".freeze, "doc/csv/options/generating/write_converters.rdoc".freeze, "doc/csv/options/generating/write_empty_value.rdoc".freeze, "doc/csv/options/generating/write_headers.rdoc".freeze, "doc/csv/options/generating/write_nil_value.rdoc".freeze, "doc/csv/options/parsing/converters.rdoc".freeze, "doc/csv/options/parsing/empty_value.rdoc".freeze, "doc/csv/options/parsing/field_size_limit.rdoc".freeze, "doc/csv/options/parsing/header_converters.rdoc".freeze, "doc/csv/options/parsing/headers.rdoc".freeze, "doc/csv/options/parsing/liberal_parsing.rdoc".freeze, "doc/csv/options/parsing/nil_value.rdoc".freeze, "doc/csv/options/parsing/return_headers.rdoc".freeze, "doc/csv/options/parsing/skip_blanks.rdoc".freeze, "doc/csv/options/parsing/skip_lines.rdoc".freeze, "doc/csv/options/parsing/strip.rdoc".freeze, "doc/csv/options/parsing/unconverted_fields.rdoc".freeze, "doc/csv/recipes/filtering.rdoc".freeze, "doc/csv/recipes/generating.rdoc".freeze, "doc/csv/recipes/parsing.rdoc".freeze, "doc/csv/recipes/recipes.rdoc".freeze, "lib/csv.rb".freeze, "lib/csv/core_ext/array.rb".freeze, "lib/csv/core_ext/string.rb".freeze, "lib/csv/delete_suffix.rb".freeze, "lib/csv/fields_converter.rb".freeze, "lib/csv/input_record_separator.rb".freeze, "lib/csv/match_p.rb".freeze, "lib/csv/parser.rb".freeze, "lib/csv/row.rb".freeze, "lib/csv/table.rb".freeze, "lib/csv/version.rb".freeze, "lib/csv/writer.rb".freeze]
- s.homepage = "https://github.com/ruby/csv".freeze
- s.licenses = ["Ruby".freeze, "BSD-2-Clause".freeze]
- s.rdoc_options = ["--main".freeze, "README.md".freeze]
- s.required_ruby_version = Gem::Requirement.new(">= 2.5.0".freeze)
- s.rubygems_version = "3.2.5".freeze
- s.summary = "CSV Reading and Writing".freeze
+Gem::Specification.new do |spec|
+ spec.name = "csv"
+ spec.version = CSV::VERSION
+ spec.authors = ["James Edward Gray II", "Kouhei Sutou"]
+ spec.email = [nil, "kou@cozmixng.org"]
- if s.respond_to? :specification_version then
- s.specification_version = 4
- end
+ spec.summary = "CSV Reading and Writing"
+ spec.description = "The CSV library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed."
+ spec.homepage = "https://github.com/ruby/csv"
+ spec.licenses = ["Ruby", "BSD-2-Clause"]
- if s.respond_to? :add_runtime_dependency then
- s.add_development_dependency(%q<benchmark_driver>.freeze, [">= 0"])
- s.add_development_dependency(%q<bundler>.freeze, [">= 0"])
- s.add_development_dependency(%q<rake>.freeze, [">= 0"])
- s.add_development_dependency(%q<test-unit>.freeze, [">= 3.4.8"])
- else
- s.add_dependency(%q<benchmark_driver>.freeze, [">= 0"])
- s.add_dependency(%q<bundler>.freeze, [">= 0"])
- s.add_dependency(%q<rake>.freeze, [">= 0"])
- s.add_dependency(%q<test-unit>.freeze, [">= 3.4.8"])
+ lib_path = "lib"
+ spec.require_paths = [lib_path]
+ files = []
+ lib_dir = File.join(__dir__, lib_path)
+ if File.exist?(lib_dir)
+ Dir.chdir(lib_dir) do
+ Dir.glob("**/*.rb").each do |file|
+ files << "lib/#{file}"
+ end
+ end
+ end
+ doc_dir = File.join(__dir__, "doc")
+ if File.exist?(doc_dir)
+ Dir.chdir(doc_dir) do
+ Dir.glob("**/*.rdoc").each do |rdoc_file|
+ files << "doc/#{rdoc_file}"
+ end
+ end
end
+ spec.files = files
+ spec.rdoc_options.concat(["--main", "README.md"])
+ rdoc_files = [
+ "LICENSE.txt",
+ "NEWS.md",
+ "README.md",
+ ]
+ recipes_dir = File.join(doc_dir, "csv", "recipes")
+ if File.exist?(recipes_dir)
+ Dir.chdir(recipes_dir) do
+ Dir.glob("**/*.rdoc").each do |recipe_file|
+ rdoc_files << "doc/csv/recipes/#{recipe_file}"
+ end
+ end
+ end
+ spec.extra_rdoc_files = rdoc_files
+
+ spec.required_ruby_version = ">= 2.5.0"
+
+ # spec.add_dependency "stringio", ">= 0.1.3"
+ spec.add_development_dependency "bundler"
+ spec.add_development_dependency "rake"
+ spec.add_development_dependency "benchmark_driver"
+ spec.add_development_dependency "test-unit", ">= 3.4.8"
end
diff --git a/debian/changelog b/debian/changelog
index c7207a1..d2adfbe 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,9 +1,10 @@
-ruby-csv (3.2.2-2) UNRELEASED; urgency=medium
+ruby-csv (3.2.6+git20230128.1.e5622c5-1) UNRELEASED; urgency=medium
* Update standards version to 4.6.0, no changes needed.
* Update standards version to 4.6.1, no changes needed.
+ * New upstream snapshot.
- -- Debian Janitor <janitor@jelmer.uk> Thu, 28 Jul 2022 23:29:10 -0000
+ -- Debian Janitor <janitor@jelmer.uk> Thu, 09 Feb 2023 16:52:22 -0000
ruby-csv (3.2.2-1) unstable; urgency=medium
diff --git a/doc/csv/options/generating/write_headers.rdoc b/doc/csv/options/generating/write_headers.rdoc
index f9faa9d..c56aa48 100644
--- a/doc/csv/options/generating/write_headers.rdoc
+++ b/doc/csv/options/generating/write_headers.rdoc
@@ -19,7 +19,7 @@ Without +write_headers+:
With +write_headers+":
CSV.open(file_path,'w',
- :write_headers=> true,
+ :write_headers => true,
:headers => ['Name','Value']
) do |csv|
csv << ['foo', '0']
diff --git a/doc/csv/recipes/generating.rdoc b/doc/csv/recipes/generating.rdoc
index 6984339..9320d53 100644
--- a/doc/csv/recipes/generating.rdoc
+++ b/doc/csv/recipes/generating.rdoc
@@ -148,7 +148,7 @@ This example defines and uses a custom write converter to strip whitespace from
==== Recipe: Specify Multiple Write Converters
-Use option <tt>:write_converters</tt> and multiple custom coverters
+Use option <tt>:write_converters</tt> and multiple custom converters
to convert field values when generating \CSV.
This example defines and uses two custom write converters to strip and upcase generated fields:
diff --git a/doc/csv/recipes/parsing.rdoc b/doc/csv/recipes/parsing.rdoc
index ad8a57c..fc116fc 100644
--- a/doc/csv/recipes/parsing.rdoc
+++ b/doc/csv/recipes/parsing.rdoc
@@ -83,7 +83,7 @@ Use instance method CSV#each with option +headers+ to read a source \String one
CSV.new(string, headers: true).each do |row|
p row
end
-Ouput:
+Output:
#<CSV::Row "Name":"foo" "Value":"0">
#<CSV::Row "Name":"bar" "Value":"1">
#<CSV::Row "Name":"baz" "Value":"2">
diff --git a/lib/csv.rb b/lib/csv.rb
index 2c47ead..0307033 100644
--- a/lib/csv.rb
+++ b/lib/csv.rb
@@ -95,14 +95,11 @@ require "stringio"
require_relative "csv/fields_converter"
require_relative "csv/input_record_separator"
-require_relative "csv/match_p"
require_relative "csv/parser"
require_relative "csv/row"
require_relative "csv/table"
require_relative "csv/writer"
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
# == \CSV
#
# === In a Hurry?
@@ -357,7 +354,9 @@ using CSV::MatchP if CSV.const_defined?(:MatchP)
# - +row_sep+: Specifies the row separator; used to delimit rows.
# - +col_sep+: Specifies the column separator; used to delimit fields.
# - +quote_char+: Specifies the quote character; used to quote fields.
-# - +field_size_limit+: Specifies the maximum field size allowed.
+# - +field_size_limit+: Specifies the maximum field size + 1 allowed.
+# Deprecated since 3.2.3. Use +max_field_size+ instead.
+# - +max_field_size+: Specifies the maximum field size allowed.
# - +converters+: Specifies the field converters to be used.
# - +unconverted_fields+: Specifies whether unconverted fields are to be available.
# - +headers+: Specifies whether data contains headers,
@@ -864,8 +863,9 @@ class CSV
# <b><tt>index</tt></b>:: The zero-based index of the field in its row.
# <b><tt>line</tt></b>:: The line of the data source this row is from.
# <b><tt>header</tt></b>:: The header for the column, when available.
+ # <b><tt>quoted?</tt></b>:: True or false, whether the original value is quoted or not.
#
- FieldInfo = Struct.new(:index, :line, :header)
+ FieldInfo = Struct.new(:index, :line, :header, :quoted?)
# A Regexp used to find and convert some common Date formats.
DateMatcher = / \A(?: (\w+,?\s+)?\w+\s+\d{1,2},?\s+\d{2,4} |
@@ -873,10 +873,9 @@ class CSV
# A Regexp used to find and convert some common DateTime formats.
DateTimeMatcher =
/ \A(?: (\w+,?\s+)?\w+\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2},?\s+\d{2,4} |
- \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2} |
- # ISO-8601
+ # ISO-8601 and RFC-3339 (space instead of T) recognized by DateTime.parse
\d{4}-\d{2}-\d{2}
- (?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
+ (?:[T\s]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?(?:[+-]\d{2}(?::\d{2})|Z)?)?)?
)\z /x
# The encoding used by all converters.
@@ -926,7 +925,8 @@ class CSV
symbol: lambda { |h|
h.encode(ConverterEncoding).downcase.gsub(/[^\s\w]+/, "").strip.
gsub(/\s+/, "_").to_sym
- }
+ },
+ symbol_raw: lambda { |h| h.encode(ConverterEncoding).to_sym }
}
# Default values for method options.
@@ -937,6 +937,7 @@ class CSV
quote_char: '"',
# For parsing.
field_size_limit: nil,
+ max_field_size: nil,
converters: nil,
unconverted_fields: nil,
headers: false,
@@ -1004,7 +1005,7 @@ class CSV
def instance(data = $stdout, **options)
# create a _signature_ for this method call, data object and options
sig = [data.object_id] +
- options.values_at(*DEFAULT_OPTIONS.keys.sort_by { |sym| sym.to_s })
+ options.values_at(*DEFAULT_OPTIONS.keys)
# fetch or create the instance for this signature
@@instances ||= Hash.new
@@ -1143,7 +1144,7 @@ class CSV
# File.read('t.csv') # => "Name,Value\nFOO,0\nBAR,-1\nBAZ,-2\n"
#
# When neither +in_string_or_io+ nor +out_string_or_io+ given,
- # parses from {ARGF}[https://docs.ruby-lang.org/en/master/ARGF.html]
+ # parses from {ARGF}[rdoc-ref:ARGF]
# and generates to STDOUT.
#
# Without headers:
@@ -1201,7 +1202,7 @@ class CSV
# parse options for input, output, or both
in_options, out_options = Hash.new, {row_sep: InputRecordSeparator.value}
options.each do |key, value|
- case key.to_s
+ case key
when /\Ain(?:put)?_(.+)\Z/
in_options[$1.to_sym] = value
when /\Aout(?:put)?_(.+)\Z/
@@ -1464,6 +1465,46 @@ class CSV
(new(str, **options) << row).string
end
+ # :call-seq:
+ # CSV.generate_lines(rows)
+ # CSV.generate_lines(rows, **options)
+ #
+ # Returns the \String created by generating \CSV from
+ # using the specified +options+.
+ #
+ # Argument +rows+ must be an \Array of row. Row is \Array of \String or \CSV::Row.
+ #
+ # Special options:
+ # * Option <tt>:row_sep</tt> defaults to <tt>"\n"</tt> on Ruby 3.0 or later
+ # and <tt>$INPUT_RECORD_SEPARATOR</tt> (<tt>$/</tt>) otherwise.:
+ # $INPUT_RECORD_SEPARATOR # => "\n"
+ # * This method accepts an additional option, <tt>:encoding</tt>, which sets the base
+ # Encoding for the output. This method will try to guess your Encoding from
+ # the first non-+nil+ field in +row+, if possible, but you may need to use
+ # this parameter as a backup plan.
+ #
+ # For other +options+,
+ # see {Options for Generating}[#class-CSV-label-Options+for+Generating].
+ #
+ # ---
+ #
+ # Returns the \String generated from an
+ # CSV.generate_lines([['foo', '0'], ['bar', '1'], ['baz', '2']]) # => "foo,0\nbar,1\nbaz,2\n"
+ #
+ # ---
+ #
+ # Raises an exception
+ # # Raises NoMethodError (undefined method `each' for :foo:Symbol)
+ # CSV.generate_lines(:foo)
+ #
+ def generate_lines(rows, **options)
+ self.generate(**options) do |csv|
+ rows.each do |row|
+ csv << row
+ end
+ end
+ end
+
#
# :call-seq:
# open(file_path, mode = "rb", **options ) -> new_csv
@@ -1865,6 +1906,7 @@ class CSV
row_sep: :auto,
quote_char: '"',
field_size_limit: nil,
+ max_field_size: nil,
converters: nil,
unconverted_fields: nil,
headers: false,
@@ -1888,8 +1930,19 @@ class CSV
raise ArgumentError.new("Cannot parse nil as CSV") if data.nil?
if data.is_a?(String)
+ if encoding
+ if encoding.is_a?(String)
+ data_external_encoding, data_internal_encoding = encoding.split(":", 2)
+ if data_internal_encoding
+ data = data.encode(data_internal_encoding, data_external_encoding)
+ else
+ data = data.dup.force_encoding(data_external_encoding)
+ end
+ else
+ data = data.dup.force_encoding(encoding)
+ end
+ end
@io = StringIO.new(data)
- @io.set_encoding(encoding || data.encoding)
else
@io = data
end
@@ -1907,11 +1960,14 @@ class CSV
@initial_header_converters = header_converters
@initial_write_converters = write_converters
+ if max_field_size.nil? and field_size_limit
+ max_field_size = field_size_limit - 1
+ end
@parser_options = {
column_separator: col_sep,
row_separator: row_sep,
quote_character: quote_char,
- field_size_limit: field_size_limit,
+ max_field_size: max_field_size,
unconverted_fields: unconverted_fields,
headers: headers,
return_headers: return_headers,
@@ -1979,10 +2035,24 @@ class CSV
# Returns the limit for field size; used for parsing;
# see {Option +field_size_limit+}[#class-CSV-label-Option+field_size_limit]:
# CSV.new('').field_size_limit # => nil
+ #
+ # Deprecated since 3.2.3. Use +max_field_size+ instead.
def field_size_limit
parser.field_size_limit
end
+ # :call-seq:
+ # csv.max_field_size -> integer or nil
+ #
+ # Returns the limit for field size; used for parsing;
+ # see {Option +max_field_size+}[#class-CSV-label-Option+max_field_size]:
+ # CSV.new('').max_field_size # => nil
+ #
+ # Since 3.2.3.
+ def max_field_size
+ parser.max_field_size
+ end
+
# :call-seq:
# csv.skip_lines -> regexp or nil
#
@@ -2481,7 +2551,13 @@ class CSV
# p row
# end
def each(&block)
- parser_enumerator.each(&block)
+ return to_enum(__method__) unless block_given?
+ begin
+ while true
+ yield(parser_enumerator.next)
+ end
+ rescue StopIteration
+ end
end
# :call-seq:
diff --git a/lib/csv/delete_suffix.rb b/lib/csv/delete_suffix.rb
deleted file mode 100644
index d457718..0000000
--- a/lib/csv/delete_suffix.rb
+++ /dev/null
@@ -1,18 +0,0 @@
-# frozen_string_literal: true
-
-# This provides String#delete_suffix? for Ruby 2.4.
-unless String.method_defined?(:delete_suffix)
- class CSV
- module DeleteSuffix
- refine String do
- def delete_suffix(suffix)
- if end_with?(suffix)
- self[0...-suffix.size]
- else
- self
- end
- end
- end
- end
- end
-end
diff --git a/lib/csv/fields_converter.rb b/lib/csv/fields_converter.rb
index b206118..d15977d 100644
--- a/lib/csv/fields_converter.rb
+++ b/lib/csv/fields_converter.rb
@@ -44,7 +44,7 @@ class CSV
@converters.empty?
end
- def convert(fields, headers, lineno)
+ def convert(fields, headers, lineno, quoted_fields)
return fields unless need_convert?
fields.collect.with_index do |field, index|
@@ -63,7 +63,8 @@ class CSV
else
header = nil
end
- field = converter[field, FieldInfo.new(index, lineno, header)]
+ quoted = quoted_fields[index]
+ field = converter[field, FieldInfo.new(index, lineno, header, quoted)]
end
break unless field.is_a?(String) # short-circuit pipeline for speed
end
diff --git a/lib/csv/input_record_separator.rb b/lib/csv/input_record_separator.rb
index bbf1347..7a99343 100644
--- a/lib/csv/input_record_separator.rb
+++ b/lib/csv/input_record_separator.rb
@@ -4,20 +4,7 @@ require "stringio"
class CSV
module InputRecordSeparator
class << self
- is_input_record_separator_deprecated = false
- verbose, $VERBOSE = $VERBOSE, true
- stderr, $stderr = $stderr, StringIO.new
- input_record_separator = $INPUT_RECORD_SEPARATOR
- begin
- $INPUT_RECORD_SEPARATOR = "\r\n"
- is_input_record_separator_deprecated = (not $stderr.string.empty?)
- ensure
- $INPUT_RECORD_SEPARATOR = input_record_separator
- $stderr = stderr
- $VERBOSE = verbose
- end
-
- if is_input_record_separator_deprecated
+ if RUBY_VERSION >= "3.0.0"
def value
"\n"
end
diff --git a/lib/csv/match_p.rb b/lib/csv/match_p.rb
deleted file mode 100644
index 775559a..0000000
--- a/lib/csv/match_p.rb
+++ /dev/null
@@ -1,20 +0,0 @@
-# frozen_string_literal: true
-
-# This provides String#match? and Regexp#match? for Ruby 2.3.
-unless String.method_defined?(:match?)
- class CSV
- module MatchP
- refine String do
- def match?(pattern)
- self =~ pattern
- end
- end
-
- refine Regexp do
- def match?(string)
- self =~ string
- end
- end
- end
- end
-end
diff --git a/lib/csv/parser.rb b/lib/csv/parser.rb
index 7e943ac..1f8b150 100644
--- a/lib/csv/parser.rb
+++ b/lib/csv/parser.rb
@@ -2,15 +2,10 @@
require "strscan"
-require_relative "delete_suffix"
require_relative "input_record_separator"
-require_relative "match_p"
require_relative "row"
require_relative "table"
-using CSV::DeleteSuffix if CSV.const_defined?(:DeleteSuffix)
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
class CSV
# Note: Don't use this class directly. This is an internal class.
class Parser
@@ -27,6 +22,10 @@ class CSV
class InvalidEncoding < StandardError
end
+ # Raised when an unexpected case happens.
+ class UnexpectedError < StandardError
+ end
+
#
# CSV::Scanner receives a CSV output, scans it and return the content.
# It also controls the life cycle of the object with its methods +keep_start+,
@@ -78,10 +77,10 @@ class CSV
# +keep_end+, +keep_back+, +keep_drop+.
#
# CSV::InputsScanner.scan() tries to match with pattern at the current position.
- # If there's a match, the scanner advances the “scan pointer” and returns the matched string.
+ # If there's a match, the scanner advances the "scan pointer" and returns the matched string.
# Otherwise, the scanner returns nil.
#
- # CSV::InputsScanner.rest() returns the “rest” of the string (i.e. everything after the scan pointer).
+ # CSV::InputsScanner.rest() returns the "rest" of the string (i.e. everything after the scan pointer).
# If there is no more data (eos? = true), it returns "".
#
class InputsScanner
@@ -96,11 +95,13 @@ class CSV
end
def each_line(row_separator)
+ return enum_for(__method__, row_separator) unless block_given?
buffer = nil
input = @scanner.rest
position = @scanner.pos
offset = 0
n_row_separator_chars = row_separator.size
+ # trace(__method__, :start, line, input)
while true
input.each_line(row_separator) do |line|
@scanner.pos += line.bytesize
@@ -140,25 +141,28 @@ class CSV
end
def scan(pattern)
+ # trace(__method__, pattern, :start)
value = @scanner.scan(pattern)
+ # trace(__method__, pattern, :done, :last, value) if @last_scanner
return value if @last_scanner
- if value
- read_chunk if @scanner.eos?
- return value
- else
- nil
- end
+ read_chunk if value and @scanner.eos?
+ # trace(__method__, pattern, :done, value)
+ value
end
def scan_all(pattern)
+ # trace(__method__, pattern, :start)
value = @scanner.scan(pattern)
+ # trace(__method__, pattern, :done, :last, value) if @last_scanner
return value if @last_scanner
return nil if value.nil?
while @scanner.eos? and read_chunk and (sub_value = @scanner.scan(pattern))
+ # trace(__method__, pattern, :sub, sub_value)
value << sub_value
end
+ # trace(__method__, pattern, :done, value)
value
end
@@ -167,68 +171,126 @@ class CSV
end
def keep_start
- @keeps.push([@scanner.pos, nil])
+ # trace(__method__, :start)
+ adjust_last_keep
+ @keeps.push([@scanner, @scanner.pos, nil])
+ # trace(__method__, :done)
end
def keep_end
- start, buffer = @keeps.pop
- keep = @scanner.string.byteslice(start, @scanner.pos - start)
+ # trace(__method__, :start)
+ scanner, start, buffer = @keeps.pop
+ if scanner == @scanner
+ keep = @scanner.string.byteslice(start, @scanner.pos - start)
+ else
+ keep = @scanner.string.byteslice(0, @scanner.pos)
+ end
if buffer
buffer << keep
keep = buffer
end
+ # trace(__method__, :done, keep)
keep
end
def keep_back
- start, buffer = @keeps.pop
+ # trace(__method__, :start)
+ scanner, start, buffer = @keeps.pop
if buffer
+ # trace(__method__, :rescan, start, buffer)
string = @scanner.string
- keep = string.byteslice(start, string.bytesize - start)
+ if scanner == @scanner
+ keep = string.byteslice(start, string.bytesize - start)
+ else
+ keep = string
+ end
if keep and not keep.empty?
@inputs.unshift(StringIO.new(keep))
@last_scanner = false
end
@scanner = StringScanner.new(buffer)
else
+ if @scanner != scanner
+ message = "scanners are different but no buffer: "
+ message += "#{@scanner.inspect}(#{@scanner.object_id}): "
+ message += "#{scanner.inspect}(#{scanner.object_id})"
+ raise UnexpectedError, message
+ end
+ # trace(__method__, :repos, start, buffer)
@scanner.pos = start
end
read_chunk if @scanner.eos?
end
def keep_drop
- @keeps.pop
+ _, _, buffer = @keeps.pop
+ # trace(__method__, :done, :empty) unless buffer
+ return unless buffer
+
+ last_keep = @keeps.last
+ # trace(__method__, :done, :no_last_keep) unless last_keep
+ return unless last_keep
+
+ if last_keep[2]
+ last_keep[2] << buffer
+ else
+ last_keep[2] = buffer
+ end
+ # trace(__method__, :done)
end
def rest
@scanner.rest
end
+ def check(pattern)
+ @scanner.check(pattern)
+ end
+
private
- def read_chunk
- return false if @last_scanner
+ def trace(*args)
+ pp([*args, @scanner, @scanner&.string, @scanner&.pos, @keeps])
+ end
- unless @keeps.empty?
- keep = @keeps.last
- keep_start = keep[0]
- string = @scanner.string
- keep_data = string.byteslice(keep_start, @scanner.pos - keep_start)
- if keep_data
- keep_buffer = keep[1]
- if keep_buffer
- keep_buffer << keep_data
- else
- keep[1] = keep_data.dup
- end
+ def adjust_last_keep
+ # trace(__method__, :start)
+
+ keep = @keeps.last
+ # trace(__method__, :done, :empty) if keep.nil?
+ return if keep.nil?
+
+ scanner, start, buffer = keep
+ string = @scanner.string
+ if @scanner != scanner
+ start = 0
+ end
+ if start == 0 and @scanner.eos?
+ keep_data = string
+ else
+ keep_data = string.byteslice(start, @scanner.pos - start)
+ end
+ if keep_data
+ if buffer
+ buffer << keep_data
+ else
+ keep[2] = keep_data.dup
end
- keep[0] = 0
end
+ # trace(__method__, :done)
+ end
+
+ def read_chunk
+ return false if @last_scanner
+
+ adjust_last_keep
+
input = @inputs.first
case input
when StringIO
string = input.read
raise InvalidEncoding unless string.valid_encoding?
+ # trace(__method__, :stringio, string)
@scanner = StringScanner.new(string)
@inputs.shift
@last_scanner = @inputs.empty?
@@ -237,6 +299,7 @@ class CSV
chunk = input.gets(@row_separator, @chunk_size)
if chunk
raise InvalidEncoding unless chunk.valid_encoding?
+ # trace(__method__, :chunk, chunk)
@scanner = StringScanner.new(chunk)
if input.respond_to?(:eof?) and input.eof?
@inputs.shift
@@ -244,6 +307,7 @@ class CSV
end
true
else
+ # trace(__method__, :no_chunk)
@scanner = StringScanner.new("".encode(@encoding))
@inputs.shift
@last_scanner = @inputs.empty?
@@ -278,7 +342,11 @@ class CSV
end
def field_size_limit
- @field_size_limit
+ @max_field_size&.succ
+ end
+
+ def max_field_size
+ @max_field_size
end
def skip_lines
@@ -346,6 +414,16 @@ class CSV
end
message = "Invalid byte sequence in #{@encoding}"
raise MalformedCSVError.new(message, lineno)
+ rescue UnexpectedError => error
+ if @scanner
+ ignore_broken_line
+ lineno = @lineno
+ else
+ lineno = @lineno + 1
+ end
+ message = "This should not be happen: #{error.message}: "
+ message += "Please report this to https://github.com/ruby/csv/issues"
+ raise MalformedCSVError.new(message, lineno)
end
end
@@ -390,7 +468,7 @@ class CSV
@backslash_quote = false
end
@unconverted_fields = @options[:unconverted_fields]
- @field_size_limit = @options[:field_size_limit]
+ @max_field_size = @options[:max_field_size]
@skip_blanks = @options[:skip_blanks]
@fields_converter = @options[:fields_converter]
@header_fields_converter = @options[:header_fields_converter]
@@ -407,7 +485,6 @@ class CSV
message = ":quote_char has to be nil or a single character String"
raise ArgumentError, message
end
- @double_quote_character = @quote_character * 2
@escaped_quote_character = Regexp.escape(@quote_character)
@escaped_quote = Regexp.new(@escaped_quote_character)
end
@@ -680,9 +757,10 @@ class CSV
case headers
when Array
@raw_headers = headers
+ quoted_fields = [false] * @raw_headers.size
@use_headers = true
when String
- @raw_headers = parse_headers(headers)
+ @raw_headers, quoted_fields = parse_headers(headers)
@use_headers = true
when nil, false
@raw_headers = nil
@@ -692,21 +770,28 @@ class CSV
@use_headers = true
end
if @raw_headers
- @headers = adjust_headers(@raw_headers)
+ @headers = adjust_headers(@raw_headers, quoted_fields)
else
@headers = nil
end
end
def parse_headers(row)
- CSV.parse_line(row,
- col_sep: @column_separator,
- row_sep: @row_separator,
- quote_char: @quote_character)
+ quoted_fields = []
+ converter = lambda do |field, info|
+ quoted_fields << info.quoted?
+ field
+ end
+ headers = CSV.parse_line(row,
+ col_sep: @column_separator,
+ row_sep: @row_separator,
+ quote_char: @quote_character,
+ converters: [converter])
+ [headers, quoted_fields]
end
- def adjust_headers(headers)
- adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno)
+ def adjust_headers(headers, quoted_fields)
+ adjusted_headers = @header_fields_converter.convert(headers, nil, @lineno, quoted_fields)
adjusted_headers.each {|h| h.freeze if h.is_a? String}
adjusted_headers
end
@@ -729,28 +814,28 @@ class CSV
sample[0, 128].index(@quote_character)
end
- SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
- if SCANNER_TEST
- class UnoptimizedStringIO
- def initialize(string)
- @io = StringIO.new(string, "rb:#{string.encoding}")
- end
+ class UnoptimizedStringIO # :nodoc:
+ def initialize(string)
+ @io = StringIO.new(string, "rb:#{string.encoding}")
+ end
- def gets(*args)
- @io.gets(*args)
- end
+ def gets(*args)
+ @io.gets(*args)
+ end
- def each_line(*args, &block)
- @io.each_line(*args, &block)
- end
+ def each_line(*args, &block)
+ @io.each_line(*args, &block)
+ end
- def eof?
- @io.eof?
- end
+ def eof?
+ @io.eof?
end
+ end
- SCANNER_TEST_CHUNK_SIZE =
- Integer((ENV["CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"] || "1"), 10)
+ SCANNER_TEST = (ENV["CSV_PARSER_SCANNER_TEST"] == "yes")
+ if SCANNER_TEST
+ SCANNER_TEST_CHUNK_SIZE_NAME = "CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"
+ SCANNER_TEST_CHUNK_SIZE_VALUE = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
def build_scanner
inputs = @samples.collect do |sample|
UnoptimizedStringIO.new(sample)
@@ -760,10 +845,17 @@ class CSV
else
inputs << @input
end
+ begin
+ chunk_size_value = ENV[SCANNER_TEST_CHUNK_SIZE_NAME]
+ rescue # Ractor::IsolationError
+ # Ractor on Ruby 3.0 can't read ENV value.
+ chunk_size_value = SCANNER_TEST_CHUNK_SIZE_VALUE
+ end
+ chunk_size = Integer((chunk_size_value || "1"), 10)
InputsScanner.new(inputs,
@encoding,
@row_separator,
- chunk_size: SCANNER_TEST_CHUNK_SIZE)
+ chunk_size: chunk_size)
end
else
def build_scanner
@@ -826,6 +918,14 @@ class CSV
end
end
+ def validate_field_size(field)
+ return unless @max_field_size
+ return if field.size <= @max_field_size
+ ignore_broken_line
+ message = "Field size exceeded: #{field.size} > #{@max_field_size}"
+ raise MalformedCSVError.new(message, @lineno)
+ end
+
def parse_no_quote(&block)
@scanner.each_line(@row_separator) do |line|
next if @skip_lines and skip_line?(line)
@@ -835,9 +935,16 @@ class CSV
if line.empty?
next if @skip_blanks
row = []
+ quoted_fields = []
else
line = strip_value(line)
row = line.split(@split_column_separator, -1)
+ quoted_fields = [false] * row.size
+ if @max_field_size
+ row.each do |column|
+ validate_field_size(column)
+ end
+ end
n_columns = row.size
i = 0
while i < n_columns
@@ -846,7 +953,7 @@ class CSV
end
end
@last_line = original_line
- emit_row(row, &block)
+ emit_row(row, quoted_fields, &block)
end
end
@@ -868,31 +975,37 @@ class CSV
next
end
row = []
+ quoted_fields = []
elsif line.include?(@cr) or line.include?(@lf)
@scanner.keep_back
@need_robust_parsing = true
return parse_quotable_robust(&block)
else
row = line.split(@split_column_separator, -1)
+ quoted_fields = []
n_columns = row.size
i = 0
while i < n_columns
column = row[i]
if column.empty?
+ quoted_fields << false
row[i] = nil
else
n_quotes = column.count(@quote_character)
if n_quotes.zero?
+ quoted_fields << false
# no quote
elsif n_quotes == 2 and
column.start_with?(@quote_character) and
column.end_with?(@quote_character)
+ quoted_fields << true
row[i] = column[1..-2]
else
@scanner.keep_back
@need_robust_parsing = true
return parse_quotable_robust(&block)
end
+ validate_field_size(row[i])
end
i += 1
end
@@ -900,13 +1013,14 @@ class CSV
@scanner.keep_drop
@scanner.keep_start
@last_line = original_line
- emit_row(row, &block)
+ emit_row(row, quoted_fields, &block)
end
@scanner.keep_drop
end
def parse_quotable_robust(&block)
row = []
+ quoted_fields = []
skip_needless_lines
start_row
while true
@@ -916,32 +1030,39 @@ class CSV
value = parse_column_value
if value
@scanner.scan_all(@strip_value) if @strip_value
- if @field_size_limit and value.size >= @field_size_limit
- ignore_broken_line
- raise MalformedCSVError.new("Field size exceeded", @lineno)
- end
+ validate_field_size(value)
end
if parse_column_end
row << value
+ quoted_fields << @quoted_column_value
elsif parse_row_end
if row.empty? and value.nil?
- emit_row([], &block) unless @skip_blanks
+ emit_row([], [], &block) unless @skip_blanks
else
row << value
- emit_row(row, &block)
+ quoted_fields << @quoted_column_value
+ emit_row(row, quoted_fields, &block)
row = []
+ quoted_fields = []
end
skip_needless_lines
start_row
elsif @scanner.eos?
break if row.empty? and value.nil?
row << value
- emit_row(row, &block)
+ quoted_fields << @quoted_column_value
+ emit_row(row, quoted_fields, &block)
break
else
if @quoted_column_value
+ if liberal_parsing? and (new_line = @scanner.check(@line_end))
+ message =
+ "Illegal end-of-line sequence outside of a quoted field " +
+ "<#{new_line.inspect}>"
+ else
+ message = "Any value after quoted field isn't allowed"
+ end
ignore_broken_line
- message = "Any value after quoted field isn't allowed"
raise MalformedCSVError.new(message, @lineno)
elsif @unquoted_column_value and
(new_line = @scanner.scan(@line_end))
@@ -1034,7 +1155,7 @@ class CSV
if (n_quotes % 2).zero?
quotes[0, (n_quotes - 2) / 2]
else
- value = quotes[0, (n_quotes - 1) / 2]
+ value = quotes[0, n_quotes / 2]
while true
quoted_value = @scanner.scan_all(@quoted_value)
value << quoted_value if quoted_value
@@ -1058,11 +1179,9 @@ class CSV
n_quotes = quotes.size
if n_quotes == 1
break
- elsif (n_quotes % 2) == 1
- value << quotes[0, (n_quotes - 1) / 2]
- break
else
value << quotes[0, n_quotes / 2]
+ break if (n_quotes % 2) == 1
end
end
value
@@ -1098,18 +1217,15 @@ class CSV
def strip_value(value)
return value unless @strip
- return nil if value.nil?
+ return value if value.nil?
case @strip
when String
- size = value.size
- while value.start_with?(@strip)
- size -= 1
- value = value[1, size]
+ while value.delete_prefix!(@strip)
+ # do nothing
end
- while value.end_with?(@strip)
- size -= 1
- value = value[0, size]
+ while value.delete_suffix!(@strip)
+ # do nothing
end
else
value.strip!
@@ -1132,22 +1248,22 @@ class CSV
@scanner.keep_start
end
- def emit_row(row, &block)
+ def emit_row(row, quoted_fields, &block)
@lineno += 1
raw_row = row
if @use_headers
if @headers.nil?
- @headers = adjust_headers(row)
+ @headers = adjust_headers(row, quoted_fields)
return unless @return_headers
row = Row.new(@headers, row, true)
else
row = Row.new(@headers,
- @fields_converter.convert(raw_row, @headers, @lineno))
+ @fields_converter.convert(raw_row, @headers, @lineno, quoted_fields))
end
else
# convert fields, if needed...
- row = @fields_converter.convert(raw_row, nil, @lineno)
+ row = @fields_converter.convert(raw_row, nil, @lineno, quoted_fields)
end
# inject unconverted fields and accessor, if requested...
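The parser hunks above replace `@field_size_limit` with `@max_field_size`: the new check rejects a field only when `field.size > max_field_size`, while `field_size_limit` is kept as `max_field_size + 1`. A small standalone sketch of that relationship follows. `FieldTooLargeError` and the two method names are hypothetical stand-ins, not the library's API.

```ruby
# Hypothetical stand-in for CSV::MalformedCSVError.
class FieldTooLargeError < StandardError; end

# Mirrors the new check: a field of exactly max_field_size characters passes.
def check_field_size(field, max_field_size)
  return if max_field_size.nil?
  return if field.size <= max_field_size
  raise FieldTooLargeError,
        "Field size exceeded: #{field.size} > #{max_field_size}"
end

# Mirrors the compatibility reader: the old limit is one more than the max.
def legacy_field_size_limit(max_field_size)
  max_field_size&.succ
end

check_field_size("abc", 3) # passes silently
limit = legacy_field_size_limit(3)
```

Under the old rule a field was rejected when `size >= field_size_limit`, which for `limit = max + 1` is the same condition as `size > max`.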
diff --git a/lib/csv/row.rb b/lib/csv/row.rb
index 62e429f..86323f7 100644
--- a/lib/csv/row.rb
+++ b/lib/csv/row.rb
@@ -703,7 +703,7 @@ class CSV
# by +index_or_header+ and +specifiers+.
#
# The nested objects may be instances of various classes.
- # See {Dig Methods}[https://docs.ruby-lang.org/en/master/doc/dig_methods_rdoc.html].
+ # See {Dig Methods}[rdoc-ref:dig_methods.rdoc].
#
# Examples:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
diff --git a/lib/csv/table.rb b/lib/csv/table.rb
index c5daf1a..fb19f54 100644
--- a/lib/csv/table.rb
+++ b/lib/csv/table.rb
@@ -890,9 +890,8 @@ class CSV
if @mode == :row or @mode == :col_or_row # by index
@table.delete_if(&block)
else # by header
- deleted = []
headers.each do |header|
- deleted << delete(header) if yield([header, self[header]])
+ delete(header) if yield([header, self[header]])
end
end
@@ -999,9 +998,15 @@ class CSV
# Omits the headers if option +write_headers+ is given as +false+
# (see {Option +write_headers+}[../CSV.html#class-CSV-label-Option+write_headers]):
# table.to_csv(write_headers: false) # => "foo,0\nbar,1\nbaz,2\n"
- def to_csv(write_headers: true, **options)
+ #
+ # Limits the number of rows written if option +limit+ is given, for example +2+:
+ # table.to_csv(limit: 2) # => "Name,Value\nfoo,0\nbar,1\n"
+ def to_csv(write_headers: true, limit: nil, **options)
array = write_headers ? [headers.to_csv(**options)] : []
- @table.each do |row|
+ limit ||= @table.size
+ limit = @table.size + 1 + limit if limit < 0
+ limit = 0 if limit < 0
+ @table.first(limit).each do |row|
array.push(row.fields.to_csv(**options)) unless row.header_row?
end
@@ -1038,9 +1043,13 @@ class CSV
# Example:
# source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
# table = CSV.parse(source, headers: true)
- # table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>"
+ # table.inspect # => "#<CSV::Table mode:col_or_row row_count:4>\nName,Value\nfoo,0\nbar,1\nbaz,2\n"
+ #
def inspect
- "#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>".encode("US-ASCII")
+ inspected = +"#<#{self.class} mode:#{@mode} row_count:#{to_a.size}>"
+ summary = to_csv(limit: 5)
+ inspected << "\n" << summary if summary.encoding.ascii_compatible?
+ inspected
end
end
end
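The `to_csv` hunk above normalizes `limit` before calling `first(limit)`. The arithmetic can be sketched in isolation as below; `effective_limit` is a hypothetical helper name, and the sketch works on a plain row count rather than a real `CSV::Table`.

```ruby
# Returns how many leading rows to keep, given a requested limit and row count.
# nil keeps everything; a negative limit counts back from the end
# (-1 keeps all rows, -2 drops the last one); anything further back clamps to 0.
def effective_limit(limit, size)
  limit ||= size
  limit = size + 1 + limit if limit < 0
  limit = 0 if limit < 0
  limit
end

rows = ["foo,0", "bar,1", "baz,2"]
kept = rows.first(effective_limit(-2, rows.size))
```

A limit larger than the row count needs no special handling, because `Array#first` simply returns every element.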
diff --git a/lib/csv/version.rb b/lib/csv/version.rb
index d1d0dc0..edafc6f 100644
--- a/lib/csv/version.rb
+++ b/lib/csv/version.rb
@@ -2,5 +2,5 @@
class CSV
# The version of the installed library.
- VERSION = "3.2.2"
+ VERSION = "3.2.7"
end
diff --git a/lib/csv/writer.rb b/lib/csv/writer.rb
index 4a9a35c..030a295 100644
--- a/lib/csv/writer.rb
+++ b/lib/csv/writer.rb
@@ -1,11 +1,8 @@
# frozen_string_literal: true
require_relative "input_record_separator"
-require_relative "match_p"
require_relative "row"
-using CSV::MatchP if CSV.const_defined?(:MatchP)
-
class CSV
# Note: Don't use this class directly. This is an internal class.
class Writer
@@ -42,7 +39,10 @@ class CSV
@headers ||= row if @use_headers
@lineno += 1
- row = @fields_converter.convert(row, nil, lineno) if @fields_converter
+ if @fields_converter
+ quoted_fields = [false] * row.size
+ row = @fields_converter.convert(row, nil, lineno, quoted_fields)
+ end
i = -1
converted_row = row.collect do |field|
@@ -97,7 +97,7 @@ class CSV
return unless @headers
converter = @options[:header_fields_converter]
- @headers = converter.convert(@headers, nil, 0)
+ @headers = converter.convert(@headers, nil, 0, [])
@headers.each do |header|
header.freeze if header.is_a?(String)
end
diff --git a/profile/parse.rb b/profile/parse.rb
new file mode 100755
index 0000000..3b67b34
--- /dev/null
+++ b/profile/parse.rb
@@ -0,0 +1,49 @@
+#!/usr/bin/env ruby
+
+require "csv"
+require "optparse"
+
+n_columns = 1000
+n_rows = 1000
+type = "unquoted"
+
+alphas = nil
+hiraganas = nil
+
+builders = {
+ "unquoted" => lambda {(alphas.join(",") + "\r\n") * n_rows},
+ "quoted" => lambda {(alphas.map {|s| %("#{s}")}.join(",") + "\r\n") * n_rows},
+ "include-column-separator" =>
+ lambda {(alphas.map {|s| %(",#{s}")}.join(",") + "\r\n") * n_rows},
+ "include-row-separator" =>
+ lambda {(alphas.map {|s| %("#{s}\r\n")}.join(",") + "\r\n") * n_rows},
+ "utf-8" => lambda {((hiraganas.join(",") + "\r\n") * n_rows).encode("UTF-8")},
+ "windows-31j" =>
+ lambda {((hiraganas.join(",") + "\r\n") * n_rows).encode("Windows-31J")},
+}
+
+parser = OptionParser.new
+parser.on("--n-columns=N", Integer,
+ "The number of columns to be parsed",
+ "(#{n_columns})") do |n|
+ n_columns = n
+end
+parser.on("--n-rows=N", Integer,
+ "The number of rows to be parsed",
+ "(#{n_rows})") do |n|
+ n_rows = n
+end
+parser.on("--type=TYPE", builders.keys,
+ "The type for profile",
+ "(#{type})") do |t|
+ type = t
+end
+parser.parse!(ARGV)
+
+alphas = ["AAAAA"] * n_columns
+hiraganas = ["あああああ"] * n_columns
+
+data = builders[type].call
+
+require "profile"
+CSV.parse(data)
diff --git a/profile/write.rb b/profile/write.rb
new file mode 100755
index 0000000..f5177c0
--- /dev/null
+++ b/profile/write.rb
@@ -0,0 +1,53 @@
+#!/usr/bin/env ruby
+
+require "csv"
+require "optparse"
+
+n_columns = 5
+n_rows = 100
+type = "generate-line"
+
+parser = OptionParser.new
+parser.on("--n-columns=N", Integer,
+ "The number of columns to be generated",
+ "(#{n_columns})") do |n|
+ n_columns = n
+end
+parser.on("--n-rows=N", Integer,
+ "The number of rows to be generated",
+ "(#{n_rows})") do |n|
+ n_rows = n
+end
+parser.on("--type=TYPE",
+ "The type to write",
+ "(#{type})") do |t|
+ type = t
+end
+parser.parse!(ARGV)
+
+fields = ["AAAAA"] * n_columns
+headers = n_columns.times.collect do |i|
+ "header#{i}"
+end
+row = CSV::Row.new(headers, fields)
+raw_row = {}
+n_columns.times do |i|
+ raw_row[headers[i]] = fields[i]
+end
+
+require "profile"
+
+case type
+when "generate-line"
+ n_rows.times do
+ CSV.generate_line(fields)
+ end
+when "add"
+ output = StringIO.new
+ csv = CSV.new(output)
+ n_rows.times do
+ csv << row
+ end
+else
+ raise "unknown type: #{type.inspect}"
+end
diff --git a/run-test.rb b/run-test.rb
new file mode 100755
index 0000000..8c2641d
--- /dev/null
+++ b/run-test.rb
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+
+$VERBOSE = true
+
+$LOAD_PATH.unshift("test")
+$LOAD_PATH.unshift("test/lib")
+$LOAD_PATH.unshift("lib")
+
+Dir.glob("test/csv/**/*test_*.rb") do |test_rb|
+ # Ensure we only load syntax that we can handle
+ next if RUBY_VERSION < "2.7" && test_rb.end_with?("test_patterns.rb")
+
+ require File.expand_path(test_rb)
+end
diff --git a/test/csv/helper.rb b/test/csv/helper.rb
new file mode 100644
index 0000000..1f9cf96
--- /dev/null
+++ b/test/csv/helper.rb
@@ -0,0 +1,42 @@
+require "tempfile"
+require "test/unit"
+
+require "csv"
+
+require_relative "../lib/with_different_ofs"
+
+module Helper
+ def with_chunk_size(chunk_size)
+ chunk_size_keep = ENV["CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"]
+ begin
+ ENV["CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"] = chunk_size
+ yield
+ ensure
+ ENV["CSV_PARSER_SCANNER_TEST_CHUNK_SIZE"] = chunk_size_keep
+ end
+ end
+
+ def with_verbose(verbose)
+ original = $VERBOSE
+ begin
+ $VERBOSE = verbose
+ yield
+ ensure
+ $VERBOSE = original
+ end
+ end
+
+ def with_default_internal(encoding)
+ original = Encoding.default_internal
+ begin
+ with_verbose(false) do
+ Encoding.default_internal = encoding
+ end
+ yield
+ ensure
+ with_verbose(false) do
+ Encoding.default_internal = original
+ end
+ end
+ end
+end
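The helpers above all follow the same save, override, restore-in-`ensure` pattern. A generic sketch of it for environment variables is shown below. `with_env` and `SKETCH_CHUNK_SIZE` are hypothetical names and are not part of the test suite.

```ruby
# Temporarily sets an environment variable for the duration of the block.
# The previous value (or absence) is restored even if the block raises.
def with_env(name, value)
  original = ENV[name]
  begin
    ENV[name] = value # values must be Strings; nil removes the variable
    yield
  ensure
    ENV[name] = original
  end
end

ENV.delete("SKETCH_CHUNK_SIZE")
seen = nil
with_env("SKETCH_CHUNK_SIZE", "1") do
  seen = ENV["SKETCH_CHUNK_SIZE"]
end
```

Because assigning `nil` to an `ENV` key deletes it, a variable that was unset beforehand is unset again afterwards.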
diff --git a/test/csv/interface/test_delegation.rb b/test/csv/interface/test_delegation.rb
new file mode 100644
index 0000000..3492576
--- /dev/null
+++ b/test/csv/interface/test_delegation.rb
@@ -0,0 +1,47 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVInterfaceDelegation < Test::Unit::TestCase
+ class TestStringIO < self
+ def setup
+ @csv = CSV.new("h1,h2")
+ end
+
+ def test_flock
+ assert_raise(NotImplementedError) do
+ @csv.flock(File::LOCK_EX)
+ end
+ end
+
+ def test_ioctl
+ assert_raise(NotImplementedError) do
+ @csv.ioctl(0)
+ end
+ end
+
+ def test_stat
+ assert_raise(NotImplementedError) do
+ @csv.stat
+ end
+ end
+
+ def test_to_i
+ assert_raise(NotImplementedError) do
+ @csv.to_i
+ end
+ end
+
+ def test_binmode?
+ assert_equal(false, @csv.binmode?)
+ end
+
+ def test_path
+ assert_equal(nil, @csv.path)
+ end
+
+ def test_to_io
+ assert_instance_of(StringIO, @csv.to_io)
+ end
+ end
+end
diff --git a/test/csv/interface/test_read.rb b/test/csv/interface/test_read.rb
new file mode 100644
index 0000000..0011770
--- /dev/null
+++ b/test/csv/interface/test_read.rb
@@ -0,0 +1,381 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVInterfaceRead < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @data = ""
+ @data << "1\t2\t3\r\n"
+ @data << "4\t5\r\n"
+ @input = Tempfile.new(["interface-read", ".csv"], binmode: true)
+ @input << @data
+ @input.rewind
+ @rows = [
+ ["1", "2", "3"],
+ ["4", "5"],
+ ]
+ end
+
+ def teardown
+ @input.close(true)
+ super
+ end
+
+ def test_foreach
+ rows = []
+ CSV.foreach(@input.path, col_sep: "\t", row_sep: "\r\n") do |row|
+ rows << row
+ end
+ assert_equal(@rows, rows)
+ end
+
+ if respond_to?(:ractor)
+ ractor
+ def test_foreach_in_ractor
+ ractor = Ractor.new(@input.path) do |path|
+ rows = []
+ CSV.foreach(path, col_sep: "\t", row_sep: "\r\n") do |row|
+ rows << row
+ end
+ rows
+ end
+ rows = [
+ ["1", "2", "3"],
+ ["4", "5"],
+ ]
+ assert_equal(rows, ractor.take)
+ end
+ end
+
+ def test_foreach_mode
+ rows = []
+ CSV.foreach(@input.path, "r", col_sep: "\t", row_sep: "\r\n") do |row|
+ rows << row
+ end
+ assert_equal(@rows, rows)
+ end
+
+ def test_foreach_enumerator
+ rows = CSV.foreach(@input.path, col_sep: "\t", row_sep: "\r\n").to_a
+ assert_equal(@rows, rows)
+ end
+
+ def test_closed?
+ csv = CSV.open(@input.path, "r+", col_sep: "\t", row_sep: "\r\n")
+ assert_not_predicate(csv, :closed?)
+ csv.close
+ assert_predicate(csv, :closed?)
+ end
+
+ def test_open_auto_close
+ csv = nil
+ CSV.open(@input.path) do |_csv|
+ csv = _csv
+ end
+ assert_predicate(csv, :closed?)
+ end
+
+ def test_open_closed
+ csv = nil
+ CSV.open(@input.path) do |_csv|
+ csv = _csv
+ csv.close
+ end
+ assert_predicate(csv, :closed?)
+ end
+
+ def test_open_block_return_value
+ return_value = CSV.open(@input.path) do
+ "Return value."
+ end
+ assert_equal("Return value.", return_value)
+ end
+
+ def test_open_encoding_valid
+ # U+1F600 GRINNING FACE
+ # U+1F601 GRINNING FACE WITH SMILING EYES
+ File.open(@input.path, "w") do |file|
+ file << "\u{1F600},\u{1F601}"
+ end
+ CSV.open(@input.path, encoding: "utf-8") do |csv|
+ assert_equal([["\u{1F600}", "\u{1F601}"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_encoding_invalid
+ # U+1F600 GRINNING FACE
+ # U+1F601 GRINNING FACE WITH SMILING EYES
+ File.open(@input.path, "w") do |file|
+ file << "\u{1F600},\u{1F601}"
+ end
+ CSV.open(@input.path, encoding: "EUC-JP") do |csv|
+ error = assert_raise(CSV::MalformedCSVError) do
+ csv.shift
+ end
+ assert_equal("Invalid byte sequence in EUC-JP in line 1.",
+ error.message)
+ end
+ end
+
+ def test_open_encoding_nonexistent
+ _output, error = capture_output do
+ CSV.open(@input.path, encoding: "nonexistent") do
+ end
+ end
+ assert_equal("path:0: warning: Unsupported encoding nonexistent ignored\n",
+ error.gsub(/\A.+:\d+: /, "path:0: "))
+ end
+
+ def test_open_encoding_utf_8_with_bom
+ # U+FEFF ZERO WIDTH NO-BREAK SPACE, BOM
+ # U+1F600 GRINNING FACE
+ # U+1F601 GRINNING FACE WITH SMILING EYES
+ File.open(@input.path, "w") do |file|
+ file << "\u{FEFF}\u{1F600},\u{1F601}"
+ end
+ CSV.open(@input.path, encoding: "bom|utf-8") do |csv|
+ assert_equal([["\u{1F600}", "\u{1F601}"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_invalid_byte_sequence_in_utf_8
+ CSV.open(@input.path, "w", encoding: Encoding::CP932) do |rows|
+ error = assert_raise(Encoding::InvalidByteSequenceError) do
+ rows << ["\x82\xa0"]
+ end
+ assert_equal('"\x82" on UTF-8',
+ error.message)
+ end
+ end
+
+ def test_open_with_invalid_nil
+ CSV.open(@input.path, "w", encoding: Encoding::CP932, invalid: nil) do |rows|
+ error = assert_raise(Encoding::InvalidByteSequenceError) do
+ rows << ["\x82\xa0"]
+ end
+ assert_equal('"\x82" on UTF-8',
+ error.message)
+ end
+ end
+
+ def test_open_with_invalid_replace
+ CSV.open(@input.path, "w", encoding: Encoding::CP932, invalid: :replace) do |rows|
+ rows << ["\x82\xa0".force_encoding(Encoding::UTF_8)]
+ end
+ CSV.open(@input.path, encoding: Encoding::CP932) do |csv|
+ assert_equal([["??"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_with_invalid_replace_and_replace_string
+ CSV.open(@input.path, "w", encoding: Encoding::CP932, invalid: :replace, replace: "X") do |rows|
+ rows << ["\x82\xa0".force_encoding(Encoding::UTF_8)]
+ end
+ CSV.open(@input.path, encoding: Encoding::CP932) do |csv|
+ assert_equal([["XX"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_with_undef_replace
+ # U+00B7 Middle Dot
+ CSV.open(@input.path, "w", encoding: Encoding::CP932, undef: :replace) do |rows|
+ rows << ["\u00B7"]
+ end
+ CSV.open(@input.path, encoding: Encoding::CP932) do |csv|
+ assert_equal([["?"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_with_undef_replace_and_replace_string
+ # U+00B7 Middle Dot
+ CSV.open(@input.path, "w", encoding: Encoding::CP932, undef: :replace, replace: "X") do |rows|
+ rows << ["\u00B7"]
+ end
+ CSV.open(@input.path, encoding: Encoding::CP932) do |csv|
+ assert_equal([["X"]],
+ csv.to_a)
+ end
+ end
+
+ def test_open_with_newline
+ CSV.open(@input.path, col_sep: "\t", universal_newline: true) do |csv|
+ assert_equal(@rows, csv.to_a)
+ end
+ File.binwrite(@input.path, "1,2,3\r\n" "4,5\n")
+ CSV.open(@input.path, newline: :universal) do |csv|
+ assert_equal(@rows, csv.to_a)
+ end
+ end
+
+ def test_parse
+ assert_equal(@rows,
+ CSV.parse(@data, col_sep: "\t", row_sep: "\r\n"))
+ end
+
+ def test_parse_block
+ rows = []
+ CSV.parse(@data, col_sep: "\t", row_sep: "\r\n") do |row|
+ rows << row
+ end
+ assert_equal(@rows, rows)
+ end
+
+ def test_parse_enumerator
+ rows = CSV.parse(@data, col_sep: "\t", row_sep: "\r\n").to_a
+ assert_equal(@rows, rows)
+ end
+
+ def test_parse_headers_only
+ table = CSV.parse("a,b,c", headers: true)
+ assert_equal([
+ ["a", "b", "c"],
+ [],
+ ],
+ [
+ table.headers,
+ table.each.to_a,
+ ])
+ end
+
+ def test_parse_line
+ assert_equal(["1", "2", "3"],
+ CSV.parse_line("1;2;3", col_sep: ";"))
+ end
+
+ def test_parse_line_shortcut
+ assert_equal(["1", "2", "3"],
+ "1;2;3".parse_csv(col_sep: ";"))
+ end
+
+ def test_parse_line_empty
+ assert_equal(nil, CSV.parse_line("")) # to signal eof
+ end
+
+ def test_parse_line_empty_line
+ assert_equal([], CSV.parse_line("\n1,2,3"))
+ end
+
+ def test_read
+ assert_equal(@rows,
+ CSV.read(@input.path, col_sep: "\t", row_sep: "\r\n"))
+ end
+
+ if respond_to?(:ractor)
+ ractor
+ def test_read_in_ractor
+ ractor = Ractor.new(@input.path) do |path|
+ CSV.read(path, col_sep: "\t", row_sep: "\r\n")
+ end
+ rows = [
+ ["1", "2", "3"],
+ ["4", "5"],
+ ]
+ assert_equal(rows, ractor.take)
+ end
+ end
+
+ def test_readlines
+ assert_equal(@rows,
+ CSV.readlines(@input.path, col_sep: "\t", row_sep: "\r\n"))
+ end
+
+ def test_open_read
+ rows = CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ csv.read
+ end
+ assert_equal(@rows, rows)
+ end
+
+ def test_open_readlines
+ rows = CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ csv.readlines
+ end
+ assert_equal(@rows, rows)
+ end
+
+ def test_table
+ table = CSV.table(@input.path, col_sep: "\t", row_sep: "\r\n")
+ assert_equal(CSV::Table.new([
+ CSV::Row.new([:"1", :"2", :"3"], [4, 5, nil]),
+ ]),
+ table)
+ end
+
+ def test_shift # aliased as gets() and readline()
+ CSV.open(@input.path, "rb+", col_sep: "\t", row_sep: "\r\n") do |csv|
+ rows = [
+ csv.shift,
+ csv.shift,
+ csv.shift,
+ ]
+ assert_equal(@rows + [nil],
+ rows)
+ end
+ end
+
+ def test_enumerator
+ CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ assert_equal(@rows, csv.each.to_a)
+ end
+ end
+
+ def test_shift_and_each
+ CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ rows = []
+ rows << csv.shift
+ rows.concat(csv.each.to_a)
+ assert_equal(@rows, rows)
+ end
+ end
+
+ def test_each_twice
+ CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ assert_equal([
+ @rows,
+ [],
+ ],
+ [
+ csv.each.to_a,
+ csv.each.to_a,
+ ])
+ end
+ end
+
+ def test_eof?
+ eofs = []
+ CSV.open(@input.path, col_sep: "\t", row_sep: "\r\n") do |csv|
+ eofs << csv.eof?
+ csv.shift
+ eofs << csv.eof?
+ csv.shift
+ eofs << csv.eof?
+ end
+ assert_equal([false, false, true],
+ eofs)
+ end
+
+ def test_new_nil
+ assert_raise_with_message ArgumentError, "Cannot parse nil as CSV" do
+ CSV.new(nil)
+ end
+ end
+
+ def test_options_not_modified
+ options = {}.freeze
+ CSV.foreach(@input.path, **options)
+ CSV.open(@input.path, **options) {}
+ CSV.parse("", **options)
+ CSV.parse_line("", **options)
+ CSV.read(@input.path, **options)
+ CSV.readlines(@input.path, **options)
+ CSV.table(@input.path, **options)
+ end
+end
diff --git a/test/csv/interface/test_read_write.rb b/test/csv/interface/test_read_write.rb
new file mode 100644
index 0000000..c371e9c
--- /dev/null
+++ b/test/csv/interface/test_read_write.rb
@@ -0,0 +1,124 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVInterfaceReadWrite < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_filter
+ input = <<-CSV.freeze
+1;2;3
+4;5
+ CSV
+ output = ""
+ CSV.filter(input, output,
+ in_col_sep: ";",
+ out_col_sep: ",",
+ converters: :all) do |row|
+ row.map! {|n| n * 2}
+ row << "Added\r"
+ end
+ assert_equal(<<-CSV, output)
+2,4,6,"Added\r"
+8,10,"Added\r"
+ CSV
+ end
+
+ def test_filter_headers_true
+ input = <<-CSV.freeze
+Name,Value
+foo,0
+bar,1
+baz,2
+ CSV
+ output = ""
+ CSV.filter(input, output, headers: true) do |row|
+ row[0] += "X"
+ row[1] = row[1].to_i + 1
+ end
+ assert_equal(<<-CSV, output)
+fooX,1
+barX,2
+bazX,3
+ CSV
+ end
+
+ def test_filter_headers_true_write_headers
+ input = <<-CSV.freeze
+Name,Value
+foo,0
+bar,1
+baz,2
+ CSV
+ output = ""
+ CSV.filter(input, output, headers: true, out_write_headers: true) do |row|
+ if row.is_a?(Array)
+ row[0] += "X"
+ row[1] += "Y"
+ else
+ row[0] += "X"
+ row[1] = row[1].to_i + 1
+ end
+ end
+ assert_equal(<<-CSV, output)
+NameX,ValueY
+fooX,1
+barX,2
+bazX,3
+ CSV
+ end
+
+ def test_filter_headers_array_write_headers
+ input = <<-CSV.freeze
+foo,0
+bar,1
+baz,2
+ CSV
+ output = ""
+ CSV.filter(input, output,
+ headers: ["Name", "Value"],
+ out_write_headers: true) do |row|
+ row[0] += "X"
+ row[1] = row[1].to_i + 1
+ end
+ assert_equal(<<-CSV, output)
+Name,Value
+fooX,1
+barX,2
+bazX,3
+ CSV
+ end
+
+ def test_instance_same
+ data = ""
+ assert_equal(CSV.instance(data, col_sep: ";").object_id,
+ CSV.instance(data, col_sep: ";").object_id)
+ end
+
+ def test_instance_append
+ output = ""
+ CSV.instance(output, col_sep: ";") << ["a", "b", "c"]
+ assert_equal(<<-CSV, output)
+a;b;c
+ CSV
+ CSV.instance(output, col_sep: ";") << [1, 2, 3]
+ assert_equal(<<-CSV, output)
+a;b;c
+1;2;3
+ CSV
+ end
+
+ def test_instance_shortcut
+ assert_equal(CSV.instance,
+ CSV {|csv| csv})
+ end
+
+ def test_instance_shortcut_with_io
+ io = StringIO.new
+ from_instance = CSV.instance(io, col_sep: ";") { |csv| csv << ["a", "b", "c"] }
+ from_shortcut = CSV(io, col_sep: ";") { |csv| csv << ["e", "f", "g"] }
+
+ assert_equal(from_instance, from_shortcut)
+ assert_equal(from_instance.string, "a;b;c\ne;f;g\n")
+ end
+end
diff --git a/test/csv/interface/test_write.rb b/test/csv/interface/test_write.rb
new file mode 100644
index 0000000..0cd39a7
--- /dev/null
+++ b/test/csv/interface/test_write.rb
@@ -0,0 +1,217 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVInterfaceWrite < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @output = Tempfile.new(["interface-write", ".csv"])
+ end
+
+ def teardown
+ @output.close(true)
+ super
+ end
+
+ def test_generate_default
+ csv_text = CSV.generate do |csv|
+ csv << [1, 2, 3] << [4, nil, 5]
+ end
+ assert_equal(<<-CSV, csv_text)
+1,2,3
+4,,5
+ CSV
+ end
+
+ if respond_to?(:ractor)
+ ractor
+ def test_generate_default_in_ractor
+ ractor = Ractor.new do
+ CSV.generate do |csv|
+ csv << [1, 2, 3] << [4, nil, 5]
+ end
+ end
+ assert_equal(<<-CSV, ractor.take)
+1,2,3
+4,,5
+ CSV
+ end
+ end
+
+ def test_generate_append
+ csv_text = <<-CSV
+1,2,3
+4,,5
+ CSV
+ CSV.generate(csv_text) do |csv|
+ csv << ["last", %Q{"row"}]
+ end
+ assert_equal(<<-CSV, csv_text)
+1,2,3
+4,,5
+last,"""row"""
+ CSV
+ end
+
+ def test_generate_no_new_line
+ csv_text = CSV.generate("test") do |csv|
+ csv << ["row"]
+ end
+ assert_equal(<<-CSV, csv_text)
+testrow
+ CSV
+ end
+
+ def test_generate_line_col_sep
+ line = CSV.generate_line(["1", "2", "3"], col_sep: ";")
+ assert_equal(<<-LINE, line)
+1;2;3
+ LINE
+ end
+
+ def test_generate_line_row_sep
+ line = CSV.generate_line(["1", "2"], row_sep: nil)
+ assert_equal(<<-LINE.chomp, line)
+1,2
+ LINE
+ end
+
+ def test_generate_line_shortcut
+ line = ["1", "2", "3"].to_csv(col_sep: ";")
+ assert_equal(<<-LINE, line)
+1;2;3
+ LINE
+ end
+
+ def test_generate_lines
+ lines = CSV.generate_lines([["foo", "bar"], [1, 2], [3, 4]])
+ assert_equal(<<-LINES, lines)
+foo,bar
+1,2
+3,4
+ LINES
+ end
+
+ def test_headers_detection
+ headers = ["a", "b", "c"]
+ CSV.open(@output.path, "w", headers: true) do |csv|
+ csv << headers
+ csv << ["1", "2", "3"]
+ assert_equal(headers, csv.headers)
+ end
+ end
+
+ def test_lineno
+ CSV.open(@output.path, "w") do |csv|
+ n_lines = 20
+ n_lines.times do
+ csv << ["a", "b", "c"]
+ end
+ assert_equal(n_lines, csv.lineno)
+ end
+ end
+
+ def test_append_row
+ CSV.open(@output.path, "wb") do |csv|
+ csv <<
+ CSV::Row.new([], ["1", "2", "3"]) <<
+ CSV::Row.new([], ["a", "b", "c"])
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+1,2,3
+a,b,c
+ CSV
+ end
+
+
+ if respond_to?(:ractor)
+ ractor
+ def test_append_row_in_ractor
+ ractor = Ractor.new(@output.path) do |path|
+ CSV.open(path, "wb") do |csv|
+ csv <<
+ CSV::Row.new([], ["1", "2", "3"]) <<
+ CSV::Row.new([], ["a", "b", "c"])
+ end
+ end
+ ractor.take
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+1,2,3
+a,b,c
+ CSV
+ end
+ end
+
+ def test_append_hash
+ CSV.open(@output.path, "wb", headers: true) do |csv|
+ csv << [:a, :b, :c]
+ csv << {a: 1, b: 2, c: 3}
+ csv << {a: 4, b: 5, c: 6}
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+a,b,c
+1,2,3
+4,5,6
+ CSV
+ end
+
+ def test_append_hash_headers_array
+ CSV.open(@output.path, "wb", headers: [:b, :a, :c]) do |csv|
+ csv << {a: 1, b: 2, c: 3}
+ csv << {a: 4, b: 5, c: 6}
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+2,1,3
+5,4,6
+ CSV
+ end
+
+ def test_append_hash_headers_string
+ CSV.open(@output.path, "wb", headers: "b|a|c", col_sep: "|") do |csv|
+ csv << {"a" => 1, "b" => 2, "c" => 3}
+ csv << {"a" => 4, "b" => 5, "c" => 6}
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+2|1|3
+5|4|6
+ CSV
+ end
+
+ def test_write_headers
+ CSV.open(@output.path,
+ "wb",
+ headers: "b|a|c",
+ write_headers: true,
+ col_sep: "|" ) do |csv|
+ csv << {"a" => 1, "b" => 2, "c" => 3}
+ csv << {"a" => 4, "b" => 5, "c" => 6}
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+b|a|c
+2|1|3
+5|4|6
+ CSV
+ end
+
+ def test_write_headers_empty
+ CSV.open(@output.path,
+ "wb",
+ headers: "b|a|c",
+ write_headers: true,
+ col_sep: "|" ) do |csv|
+ end
+ assert_equal(<<-CSV, File.read(@output.path, mode: "rb"))
+b|a|c
+ CSV
+ end
+
+ def test_options_not_modified
+ options = {}.freeze
+ CSV.generate(**options) {}
+ CSV.generate_line([], **options)
+ CSV.filter("", "", **options)
+ CSV.instance("", **options)
+ end
+end
diff --git a/test/csv/line_endings.gz b/test/csv/line_endings.gz
new file mode 100644
index 0000000..39e1729
Binary files /dev/null and b/test/csv/line_endings.gz differ
diff --git a/test/csv/parse/test_column_separator.rb b/test/csv/parse/test_column_separator.rb
new file mode 100644
index 0000000..d6eaa7b
--- /dev/null
+++ b/test/csv/parse/test_column_separator.rb
@@ -0,0 +1,40 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseColumnSeparator < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_comma
+ assert_equal([["a", "b", nil, "d"]],
+ CSV.parse("a,b,,d", col_sep: ","))
+ end
+
+ def test_space
+ assert_equal([["a", "b", nil, "d"]],
+ CSV.parse("a b  d", col_sep: " "))
+ end
+
+ def test_tab
+ assert_equal([["a", "b", nil, "d"]],
+ CSV.parse("a\tb\t\td", col_sep: "\t"))
+ end
+
+ def test_multiple_characters_include_sub_separator
+ assert_equal([["a b", nil, "d"]],
+ CSV.parse("a b    d", col_sep: "  "))
+ end
+
+ def test_multiple_characters_leading_empty_fields
+ data = <<-CSV
+<=><=>A<=>B<=>C
+1<=>2<=>3
+ CSV
+ assert_equal([
+ [nil, nil, "A", "B", "C"],
+ ["1", "2", "3"],
+ ],
+ CSV.parse(data, col_sep: "<=>"))
+ end
+end
diff --git a/test/csv/parse/test_convert.rb b/test/csv/parse/test_convert.rb
new file mode 100644
index 0000000..c9195c7
--- /dev/null
+++ b/test/csv/parse/test_convert.rb
@@ -0,0 +1,165 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseConvert < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @data = "Numbers,:integer,1,:float,3.015"
+ @parser = CSV.new(@data)
+
+ @custom = lambda {|field| /\A:(\S.*?)\s*\Z/ =~ field ? $1.to_sym : field}
+
+ @time = Time.utc(2018, 12, 30, 6, 41, 29)
+ @windows_safe_time_data = @time.strftime("%a %b %d %H:%M:%S %Y")
+
+ @preserving_converter = lambda do |field, info|
+ f = field.encode(CSV::ConverterEncoding)
+ return f if info.quoted?
+ begin
+ Integer(f, 10)
+ rescue
+ f
+ end
+ end
+
+ @quoted_header_converter = lambda do |field, info|
+ f = field.encode(CSV::ConverterEncoding)
+ return f if info.quoted?
+ f.to_sym
+ end
+ end
+
+ def test_integer
+ @parser.convert(:integer)
+ assert_equal(["Numbers", ":integer", 1, ":float", "3.015"],
+ @parser.shift)
+ end
+
+ def test_float
+ @parser.convert(:float)
+ assert_equal(["Numbers", ":integer", 1.0, ":float", 3.015],
+ @parser.shift)
+ end
+
+ def test_float_integer
+ @parser.convert(:float)
+ @parser.convert(:integer)
+ assert_equal(["Numbers", ":integer", 1.0, ":float", 3.015],
+ @parser.shift)
+ end
+
+ def test_integer_float
+ @parser.convert(:integer)
+ @parser.convert(:float)
+ assert_equal(["Numbers", ":integer", 1, ":float", 3.015],
+ @parser.shift)
+ end
+
+ def test_numeric
+ @parser.convert(:numeric)
+ assert_equal(["Numbers", ":integer", 1, ":float", 3.015],
+ @parser.shift)
+ end
+
+ def test_all
+ @data << ",#{@windows_safe_time_data}"
+ @parser = CSV.new(@data)
+ @parser.convert(:all)
+ assert_equal(["Numbers", ":integer", 1, ":float", 3.015, @time.to_datetime],
+ @parser.shift)
+ end
+
+ def test_custom
+ @parser.convert do |field|
+ /\A:(\S.*?)\s*\Z/ =~ field ? $1.to_sym : field
+ end
+ assert_equal(["Numbers", :integer, "1", :float, "3.015"],
+ @parser.shift)
+ end
+
+ def test_builtin_custom
+ @parser.convert(:numeric)
+ @parser.convert(&@custom)
+ assert_equal(["Numbers", :integer, 1, :float, 3.015],
+ @parser.shift)
+ end
+
+ def test_custom_field_info_line
+ @parser.convert do |field, info|
+ assert_equal(1, info.line)
+ info.index == 4 ? Float(field).floor : field
+ end
+ assert_equal(["Numbers", ":integer", "1", ":float", 3],
+ @parser.shift)
+ end
+
+ def test_custom_field_info_header
+ headers = ["one", "two", "three", "four", "five"]
+ @parser = CSV.new(@data, headers: headers)
+ @parser.convert do |field, info|
+ info.header == "three" ? Integer(field) * 100 : field
+ end
+ assert_equal(CSV::Row.new(headers,
+ ["Numbers", ":integer", 100, ":float", "3.015"]),
+ @parser.shift)
+ end
+
+ def test_custom_blank_field
+ converter = lambda {|field| field.nil?}
+ row = CSV.parse_line('nil,', converters: converter)
+ assert_equal([false, true], row)
+ end
+
+ def test_nil_value
+ assert_equal(["nil", "", "a"],
+ CSV.parse_line(',"",a', nil_value: "nil"))
+ end
+
+ def test_empty_value
+ assert_equal([nil, "empty", "a"],
+ CSV.parse_line(',"",a', empty_value: "empty"))
+ end
+
+ def test_quoted_parse_line
+ row = CSV.parse_line('1,"2",3', converters: @preserving_converter)
+ assert_equal([1, "2", 3], row)
+ end
+
+ def test_quoted_parse
+ expected = [["quoted", "unquoted"], ["109", 1], ["10A", 2]]
+ rows = CSV.parse(<<~CSV, converters: @preserving_converter)
+ "quoted",unquoted
+ "109",1
+ "10A",2
+ CSV
+ assert_equal(expected, rows)
+ end
+
+ def test_quoted_alternating_quote
+ row = CSV.parse_line('"1",2,"3"', converters: @preserving_converter)
+ assert_equal(['1', 2, '3'], row)
+ end
+
+ def test_quoted_parse_headers
+ expected = [["quoted", :unquoted], ["109", "1"], ["10A", "2"]]
+ table = CSV.parse(<<~CSV, headers: true, header_converters: @quoted_header_converter)
+ "quoted",unquoted
+ "109",1
+ "10A",2
+ CSV
+ assert_equal(expected, table.to_a)
+ end
+
+ def test_quoted_parse_with_string_headers
+ expected = [["quoted", :unquoted], %w[109 1], %w[10A 2]]
+ table = CSV.parse(<<~CSV, headers: '"quoted",unquoted', header_converters: @quoted_header_converter)
+ "109",1
+ "10A",2
+ CSV
+ assert_equal(expected, table.to_a)
+ end
+end
diff --git a/test/csv/parse/test_each.rb b/test/csv/parse/test_each.rb
new file mode 100644
index 0000000..ce0b71d
--- /dev/null
+++ b/test/csv/parse/test_each.rb
@@ -0,0 +1,23 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseEach < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_twice
+ data = <<-CSV
+Ruby,2.6.0,script
+ CSV
+ csv = CSV.new(data)
+ assert_equal([
+ [["Ruby", "2.6.0", "script"]],
+ [],
+ ],
+ [
+ csv.to_a,
+ csv.to_a,
+ ])
+ end
+end
diff --git a/test/csv/parse/test_general.rb b/test/csv/parse/test_general.rb
new file mode 100644
index 0000000..ff32eef
--- /dev/null
+++ b/test/csv/parse/test_general.rb
@@ -0,0 +1,341 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require "timeout"
+
+require_relative "../helper"
+
+#
+# Following tests are my interpretation of the
+# {CSV RFC}[http://www.ietf.org/rfc/rfc4180.txt]. I only deviate from that
+# document in one place (intentionally) and that is to make the default row
+# separator <tt>$/</tt>.
+#
+class TestCSVParseGeneral < Test::Unit::TestCase
+ extend DifferentOFS
+
+ BIG_DATA = "123456789\n" * 512
+
+ def test_mastering_regex_example
+ ex = %Q{Ten Thousand,10000, 2710 ,,"10,000","It's ""10 Grand"", baby",10K}
+ assert_equal( [ "Ten Thousand", "10000", " 2710 ", nil, "10,000",
+ "It's \"10 Grand\", baby", "10K" ],
+ CSV.parse_line(ex) )
+ end
+
+ # Old Ruby 1.8 CSV library tests.
+ def test_std_lib_csv
+ [ ["\t", ["\t"]],
+ ["foo,\"\"\"\"\"\",baz", ["foo", "\"\"", "baz"]],
+ ["foo,\"\"\"bar\"\"\",baz", ["foo", "\"bar\"", "baz"]],
+ ["\"\"\"\n\",\"\"\"\n\"", ["\"\n", "\"\n"]],
+ ["foo,\"\r\n\",baz", ["foo", "\r\n", "baz"]],
+ ["\"\"", [""]],
+ ["foo,\"\"\"\",baz", ["foo", "\"", "baz"]],
+ ["foo,\"\r.\n\",baz", ["foo", "\r.\n", "baz"]],
+ ["foo,\"\r\",baz", ["foo", "\r", "baz"]],
+ ["foo,\"\",baz", ["foo", "", "baz"]],
+ ["\",\"", [","]],
+ ["foo", ["foo"]],
+ [",,", [nil, nil, nil]],
+ [",", [nil, nil]],
+ ["foo,\"\n\",baz", ["foo", "\n", "baz"]],
+ ["foo,,baz", ["foo", nil, "baz"]],
+ ["\"\"\"\r\",\"\"\"\r\"", ["\"\r", "\"\r"]],
+ ["\",\",\",\"", [",", ","]],
+ ["foo,bar,", ["foo", "bar", nil]],
+ [",foo,bar", [nil, "foo", "bar"]],
+ ["foo,bar", ["foo", "bar"]],
+ [";", [";"]],
+ ["\t,\t", ["\t", "\t"]],
+ ["foo,\"\r\n\r\",baz", ["foo", "\r\n\r", "baz"]],
+ ["foo,\"\r\n\n\",baz", ["foo", "\r\n\n", "baz"]],
+ ["foo,\"foo,bar\",baz", ["foo", "foo,bar", "baz"]],
+ [";,;", [";", ";"]] ].each do |csv_test|
+ assert_equal(csv_test.last, CSV.parse_line(csv_test.first))
+ end
+
+ [ ["foo,\"\"\"\"\"\",baz", ["foo", "\"\"", "baz"]],
+ ["foo,\"\"\"bar\"\"\",baz", ["foo", "\"bar\"", "baz"]],
+ ["foo,\"\r\n\",baz", ["foo", "\r\n", "baz"]],
+ ["\"\"", [""]],
+ ["foo,\"\"\"\",baz", ["foo", "\"", "baz"]],
+ ["foo,\"\r.\n\",baz", ["foo", "\r.\n", "baz"]],
+ ["foo,\"\r\",baz", ["foo", "\r", "baz"]],
+ ["foo,\"\",baz", ["foo", "", "baz"]],
+ ["foo", ["foo"]],
+ [",,", [nil, nil, nil]],
+ [",", [nil, nil]],
+ ["foo,\"\n\",baz", ["foo", "\n", "baz"]],
+ ["foo,,baz", ["foo", nil, "baz"]],
+ ["foo,bar", ["foo", "bar"]],
+ ["foo,\"\r\n\n\",baz", ["foo", "\r\n\n", "baz"]],
+ ["foo,\"foo,bar\",baz", ["foo", "foo,bar", "baz"]] ].each do |csv_test|
+ assert_equal(csv_test.last, CSV.parse_line(csv_test.first))
+ end
+ end
+
+ # From: http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/6496
+ def test_aras_edge_cases
+ [ [%Q{a,b}, ["a", "b"]],
+ [%Q{a,"""b"""}, ["a", "\"b\""]],
+ [%Q{a,"""b"}, ["a", "\"b"]],
+ [%Q{a,"b"""}, ["a", "b\""]],
+ [%Q{a,"\nb"""}, ["a", "\nb\""]],
+ [%Q{a,"""\nb"}, ["a", "\"\nb"]],
+ [%Q{a,"""\nb\n"""}, ["a", "\"\nb\n\""]],
+ [%Q{a,"""\nb\n""",\nc}, ["a", "\"\nb\n\"", nil]],
+ [%Q{a,,,}, ["a", nil, nil, nil]],
+ [%Q{,}, [nil, nil]],
+ [%Q{"",""}, ["", ""]],
+ [%Q{""""}, ["\""]],
+ [%Q{"""",""}, ["\"",""]],
+ [%Q{,""}, [nil,""]],
+ [%Q{,"\r"}, [nil,"\r"]],
+ [%Q{"\r\n,"}, ["\r\n,"]],
+ [%Q{"\r\n,",}, ["\r\n,", nil]] ].each do |edge_case|
+ assert_equal(edge_case.last, CSV.parse_line(edge_case.first))
+ end
+ end
+
+ def test_james_edge_cases
+ # A read at eof? should return nil.
+ assert_equal(nil, CSV.parse_line(""))
+ #
+ # With Ruby 1.8 CSV it's impossible to tell an empty line from a line
+ # containing a single +nil+ field. The old CSV library returns
+ # <tt>[nil]</tt> in these cases, but <tt>Array.new</tt> makes more sense to
+ # me.
+ #
+ assert_equal(Array.new, CSV.parse_line("\n1,2,3\n"))
+ end
+
+ def test_rob_edge_cases
+ [ [%Q{"a\nb"}, ["a\nb"]],
+ [%Q{"\n\n\n"}, ["\n\n\n"]],
+ [%Q{a,"b\n\nc"}, ['a', "b\n\nc"]],
+ [%Q{,"\r\n"}, [nil,"\r\n"]],
+ [%Q{,"\r\n."}, [nil,"\r\n."]],
+ [%Q{"a\na","one newline"}, ["a\na", 'one newline']],
+ [%Q{"a\n\na","two newlines"}, ["a\n\na", 'two newlines']],
+ [%Q{"a\r\na","one CRLF"}, ["a\r\na", 'one CRLF']],
+ [%Q{"a\r\n\r\na","two CRLFs"}, ["a\r\n\r\na", 'two CRLFs']],
+ [%Q{with blank,"start\n\nfinish"\n}, ['with blank', "start\n\nfinish"]],
+ ].each do |edge_case|
+ assert_equal(edge_case.last, CSV.parse_line(edge_case.first))
+ end
+ end
+
+ def test_non_regex_edge_cases
+ # An early version of the non-regex parser fails this test
+ [ [ "foo,\"foo,bar,baz,foo\",\"foo\"",
+ ["foo", "foo,bar,baz,foo", "foo"] ] ].each do |edge_case|
+ assert_equal(edge_case.last, CSV.parse_line(edge_case.first))
+ end
+
+ assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line("1,\"23\"4\"5\", 6")
+ end
+ end
+
+ def test_malformed_csv_cr_first_line
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line("1,2\r,3", row_sep: "\n")
+ end
+ assert_equal("Unquoted fields do not allow new line <\"\\r\"> in line 1.",
+ error.message)
+ end
+
+ def test_malformed_csv_cr_middle_line
+ csv = <<-CSV
+line,1,abc
+line,2,"def\nghi"
+
+line,4,some\rjunk
+line,5,jkl
+ CSV
+
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse(csv)
+ end
+ assert_equal("Unquoted fields do not allow new line <\"\\r\"> in line 4.",
+ error.message)
+ end
+
+ def test_malformed_csv_unclosed_quote
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line('1,2,"3...')
+ end
+ assert_equal("Unclosed quoted field in line 1.",
+ error.message)
+ end
+
+ def test_malformed_csv_illegal_quote_middle_line
+ csv = <<-CSV
+line,1,abc
+line,2,"def\nghi"
+
+line,4,8'10"
+line,5,jkl
+ CSV
+
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse(csv)
+ end
+ assert_equal("Illegal quoting in line 4.",
+ error.message)
+ end
+
+ def test_the_parse_fails_fast_when_it_can_for_unquoted_fields
+ assert_parse_errors_out('valid,fields,bad start"' + BIG_DATA)
+ end
+
+ def test_the_parse_fails_fast_when_it_can_for_unescaped_quotes
+ assert_parse_errors_out('valid,fields,"bad start"unescaped' + BIG_DATA)
+ end
+
+ def test_field_size_limit_controls_lookahead
+ assert_parse_errors_out( 'valid,fields,"' + BIG_DATA + '"',
+ field_size_limit: 2048 )
+ end
+
+ def test_field_size_limit_max_allowed
+ column = "abcde"
+ assert_equal([[column]],
+ CSV.parse("\"#{column}\"",
+ field_size_limit: column.size + 1))
+ end
+
+ def test_field_size_limit_quote_simple
+ column = "abcde"
+ assert_parse_errors_out("\"#{column}\"",
+ field_size_limit: column.size)
+ end
+
+ def test_field_size_limit_no_quote_implicitly
+ column = "abcde"
+ assert_parse_errors_out("#{column}",
+ field_size_limit: column.size)
+ end
+
+ def test_field_size_limit_no_quote_explicitly
+ column = "abcde"
+ assert_parse_errors_out("#{column}",
+ field_size_limit: column.size,
+ quote_char: nil)
+ end
+
+ def test_field_size_limit_in_extended_column_not_exceeding
+ data = <<~DATA
+ "a","b"
+ "
+ 2
+ ",""
+ DATA
+ assert_nothing_raised(CSV::MalformedCSVError) do
+ CSV.parse(data, field_size_limit: 4)
+ end
+ end
+
+ def test_field_size_limit_in_extended_column_exceeding
+ data = <<~DATA
+ "a","b"
+ "
+ 2345
+ ",""
+ DATA
+ assert_parse_errors_out(data, field_size_limit: 5)
+ end
+
+ def test_max_field_size_controls_lookahead
+ assert_parse_errors_out( 'valid,fields,"' + BIG_DATA + '"',
+ max_field_size: 2048 )
+ end
+
+ def test_max_field_size_max_allowed
+ column = "abcde"
+ assert_equal([[column]],
+ CSV.parse("\"#{column}\"",
+ max_field_size: column.size))
+ end
+
+ def test_max_field_size_quote_simple
+ column = "abcde"
+ assert_parse_errors_out("\"#{column}\"",
+ max_field_size: column.size - 1)
+ end
+
+ def test_max_field_size_no_quote_implicitly
+ column = "abcde"
+ assert_parse_errors_out("#{column}",
+ max_field_size: column.size - 1)
+ end
+
+ def test_max_field_size_no_quote_explicitly
+ column = "abcde"
+ assert_parse_errors_out("#{column}",
+ max_field_size: column.size - 1,
+ quote_char: nil)
+ end
+
+ def test_max_field_size_in_extended_column_not_exceeding
+ data = <<~DATA
+ "a","b"
+ "
+ 2
+ ",""
+ DATA
+ assert_nothing_raised(CSV::MalformedCSVError) do
+ CSV.parse(data, max_field_size: 3)
+ end
+ end
+
+ def test_max_field_size_in_extended_column_exceeding
+ data = <<~DATA
+ "a","b"
+ "
+ 2345
+ ",""
+ DATA
+ assert_parse_errors_out(data, max_field_size: 4)
+ end
+
+ def test_row_sep_auto_cr
+ assert_equal([["a"]], CSV.parse("a\r"))
+ end
+
+ def test_row_sep_auto_lf
+ assert_equal([["a"]], CSV.parse("a\n"))
+ end
+
+ def test_row_sep_auto_cr_lf
+ assert_equal([["a"]], CSV.parse("a\r\n"))
+ end
+
+ def test_seeked_string_io
+ input_with_bom = StringIO.new("\ufeffあ,い,う\r\na,b,c\r\n")
+ input_with_bom.read(3)
+ assert_equal([
+ ["あ", "い", "う"],
+ ["a", "b", "c"],
+ ],
+ CSV.new(input_with_bom).each.to_a)
+ end
+
+ private
+ def assert_parse_errors_out(data, **options)
+ assert_raise(CSV::MalformedCSVError) do
+ timeout = 0.2
+ if defined?(RubyVM::YJIT.enabled?) and RubyVM::YJIT.enabled?
+ timeout = 1 # for --yjit-call-threshold=1
+ end
+ if defined?(RubyVM::MJIT.enabled?) and RubyVM::MJIT.enabled?
+ timeout = 5 # for --jit-wait
+ end
+ Timeout.timeout(timeout) do
+ CSV.parse(data, **options)
+ fail("Parse didn't error out")
+ end
+ end
+ end
+end
diff --git a/test/csv/parse/test_header.rb b/test/csv/parse/test_header.rb
new file mode 100644
index 0000000..e8c3786
--- /dev/null
+++ b/test/csv/parse/test_header.rb
@@ -0,0 +1,342 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVHeaders < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @data = <<-CSV
+first,second,third
+A,B,C
+1,2,3
+ CSV
+ end
+
+ def test_first_row
+ [:first_row, true].each do |setting| # two names for the same setting
+ # activate headers
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse(@data, headers: setting)
+ end
+
+ # first data row - skipping headers
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{first A}, %w{second B}, %w{third C}], row.to_a)
+
+ # second data row
+ row = csv[1]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{first 1}, %w{second 2}, %w{third 3}], row.to_a)
+
+ # empty
+ assert_nil(csv[2])
+ end
+ end
+
+ def test_array_of_headers
+ # activate headers
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse(@data, headers: [:my, :new, :headers])
+ end
+
+ # first data row - skipping headers
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal( [[:my, "first"], [:new, "second"], [:headers, "third"]],
+ row.to_a )
+
+ # second data row
+ row = csv[1]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([[:my, "A"], [:new, "B"], [:headers, "C"]], row.to_a)
+
+ # third data row
+ row = csv[2]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([[:my, "1"], [:new, "2"], [:headers, "3"]], row.to_a)
+
+ # empty
+ assert_nil(csv[3])
+
+ # with return and convert
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse( @data, headers: [:my, :new, :headers],
+ return_headers: true,
+ header_converters: lambda { |h| h.to_s } )
+ end
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([["my", :my], ["new", :new], ["headers", :headers]], row.to_a)
+ assert_predicate(row, :header_row?)
+ assert_not_predicate(row, :field_row?)
+ end
+
+ def test_csv_header_string
+ # activate headers
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse(@data, headers: "my,new,headers")
+ end
+
+ # first data row - skipping headers
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{my first}, %w{new second}, %w{headers third}], row.to_a)
+
+ # second data row
+ row = csv[1]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{my A}, %w{new B}, %w{headers C}], row.to_a)
+
+ # third data row
+ row = csv[2]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{my 1}, %w{new 2}, %w{headers 3}], row.to_a)
+
+ # empty
+ assert_nil(csv[3])
+
+ # with return and convert
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse( @data, headers: "my,new,headers",
+ return_headers: true,
+ header_converters: :symbol )
+ end
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([[:my, "my"], [:new, "new"], [:headers, "headers"]], row.to_a)
+ assert_predicate(row, :header_row?)
+ assert_not_predicate(row, :field_row?)
+ end
+
+ def test_csv_header_string_inherits_separators
+ # parse with custom col_sep
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse( @data.tr(",", "|"), col_sep: "|",
+ headers: "my|new|headers" )
+ end
+
+ # verify headers were recognized
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{my first}, %w{new second}, %w{headers third}], row.to_a)
+ end
+
+ def test_return_headers
+ # activate headers and request they are returned
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse(@data, headers: true, return_headers: true)
+ end
+
+ # header row
+ row = csv[0]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal( [%w{first first}, %w{second second}, %w{third third}],
+ row.to_a )
+ assert_predicate(row, :header_row?)
+ assert_not_predicate(row, :field_row?)
+
+ # first data row - skipping headers
+ row = csv[1]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{first A}, %w{second B}, %w{third C}], row.to_a)
+ assert_not_predicate(row, :header_row?)
+ assert_predicate(row, :field_row?)
+
+ # second data row
+ row = csv[2]
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([%w{first 1}, %w{second 2}, %w{third 3}], row.to_a)
+ assert_not_predicate(row, :header_row?)
+ assert_predicate(row, :field_row?)
+
+ # empty
+ assert_nil(csv[3])
+ end
+
+ def test_converters
+ # create test data where headers and fields look alike
+ data = <<-CSV
+1,2,3
+1,2,3
+ CSV
+
+ # normal converters do not affect headers
+ csv = CSV.parse( data, headers: true,
+ return_headers: true,
+ converters: :numeric )
+ assert_equal([%w{1 1}, %w{2 2}, %w{3 3}], csv[0].to_a)
+ assert_equal([["1", 1], ["2", 2], ["3", 3]], csv[1].to_a)
+ assert_nil(csv[2])
+
+ # header converters do affect headers (only)
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse( data, headers: true,
+ return_headers: true,
+ converters: :numeric,
+ header_converters: :symbol )
+ end
+ assert_equal([[:"1", "1"], [:"2", "2"], [:"3", "3"]], csv[0].to_a)
+ assert_equal([[:"1", 1], [:"2", 2], [:"3", 3]], csv[1].to_a)
+ assert_nil(csv[2])
+ end
+
+ def test_builtin_downcase_converter
+ csv = CSV.parse( "One,TWO Three", headers: true,
+ return_headers: true,
+ header_converters: :downcase )
+ assert_equal(%w{one two\ three}, csv.headers)
+ end
+
+ def test_builtin_symbol_converter
+ # Note that the trailing space is intentional
+ csv = CSV.parse( "One,TWO Three ", headers: true,
+ return_headers: true,
+ header_converters: :symbol )
+ assert_equal([:one, :two_three], csv.headers)
+ end
+
+ def test_builtin_symbol_raw_converter
+ csv = CSV.parse( "a b,c d", headers: true,
+ return_headers: true,
+ header_converters: :symbol_raw )
+ assert_equal([:"a b", :"c d"], csv.headers)
+ end
+
+ def test_builtin_symbol_converter_with_punctuation
+ csv = CSV.parse( "One, Two & Three ($)", headers: true,
+ return_headers: true,
+ header_converters: :symbol )
+ assert_equal([:one, :two_three], csv.headers)
+ end
+
+ def test_builtin_converters_with_blank_header
+ csv = CSV.parse( "one,,three", headers: true,
+ return_headers: true,
+ header_converters: [:downcase, :symbol, :symbol_raw] )
+ assert_equal([:one, nil, :three], csv.headers)
+ end
+
+ def test_custom_converter
+ converter = lambda { |header| header.tr(" ", "_") }
+ csv = CSV.parse( "One,TWO Three",
+ headers: true,
+ return_headers: true,
+ header_converters: converter )
+ assert_equal(%w{One TWO_Three}, csv.headers)
+ end
+
+ def test_table_support
+ csv = nil
+ assert_nothing_raised(Exception) do
+ csv = CSV.parse(@data, headers: true)
+ end
+
+ assert_instance_of(CSV::Table, csv)
+ end
+
+ def test_skip_blanks
+ @data = <<-CSV
+
+
+A,B,C
+
+1,2,3
+
+
+
+ CSV
+
+ expected = [%w[1 2 3]]
+ CSV.parse(@data, headers: true, skip_blanks: true) do |row|
+ assert_equal(expected.shift, row.fields)
+ end
+
+ expected = [%w[A B C], %w[1 2 3]]
+ CSV.parse( @data,
+ headers: true,
+ return_headers: true,
+ skip_blanks: true ) do |row|
+ assert_equal(expected.shift, row.fields)
+ end
+ end
+
+ def test_headers_reader
+ # no headers
+ assert_nil(CSV.new(@data).headers)
+
+ # headers
+ csv = CSV.new(@data, headers: true)
+ assert_equal(true, csv.headers) # before headers are read
+ csv.shift # set headers
+ assert_equal(%w[first second third], csv.headers) # after headers are read
+ end
+
+ def test_blank_row
+ @data += "\n#{@data}" # add a blank row
+
+ # ensure that everything returned is a Row object
+ CSV.parse(@data, headers: true) do |row|
+ assert_instance_of(CSV::Row, row)
+ end
+ end
+
+ def test_nil_row_header
+ @data = <<-CSV
+A
+
+1
+ CSV
+
+ csv = CSV.parse(@data, headers: true)
+
+ # ensure nil row creates Row object with headers
+ row = csv[0]
+ assert_equal([["A"], [nil]],
+ [row.headers, row.fields])
+ end
+
+ def test_parse_empty
+ assert_equal(CSV::Table.new([]),
+ CSV.parse("", headers: true))
+ end
+
+ def test_parse_empty_line
+ assert_equal(CSV::Table.new([]),
+ CSV.parse("\n", headers: true))
+ end
+
+ def test_specified_empty
+ assert_equal(CSV::Table.new([],
+ headers: ["header1"]),
+ CSV.parse("", headers: ["header1"]))
+ end
+
+ def test_specified_empty_line
+ assert_equal(CSV::Table.new([CSV::Row.new(["header1"], [])],
+ headers: ["header1"]),
+ CSV.parse("\n", headers: ["header1"]))
+ end
+end
diff --git a/test/csv/parse/test_inputs_scanner.rb b/test/csv/parse/test_inputs_scanner.rb
new file mode 100644
index 0000000..06e1c84
--- /dev/null
+++ b/test/csv/parse/test_inputs_scanner.rb
@@ -0,0 +1,63 @@
+require_relative "../helper"
+
+class TestCSVParseInputsScanner < Test::Unit::TestCase
+ include Helper
+
+ def test_scan_keep_over_chunks_nested_back
+ input = CSV::Parser::UnoptimizedStringIO.new("abcdefghijklmnl")
+ scanner = CSV::Parser::InputsScanner.new([input],
+ Encoding::UTF_8,
+ nil,
+ chunk_size: 2)
+ scanner.keep_start
+ assert_equal("abc", scanner.scan_all(/[a-c]+/))
+ scanner.keep_start
+ assert_equal("def", scanner.scan_all(/[d-f]+/))
+ scanner.keep_back
+ scanner.keep_back
+ assert_equal("abcdefg", scanner.scan_all(/[a-g]+/))
+ end
+
+ def test_scan_keep_over_chunks_nested_drop_back
+ input = CSV::Parser::UnoptimizedStringIO.new("abcdefghijklmnl")
+ scanner = CSV::Parser::InputsScanner.new([input],
+ Encoding::UTF_8,
+ nil,
+ chunk_size: 3)
+ scanner.keep_start
+ assert_equal("ab", scanner.scan(/../))
+ scanner.keep_start
+ assert_equal("c", scanner.scan(/./))
+ assert_equal("d", scanner.scan(/./))
+ scanner.keep_drop
+ scanner.keep_back
+ assert_equal("abcdefg", scanner.scan_all(/[a-g]+/))
+ end
+
+ def test_each_line_keep_over_chunks_multibyte
+ input = CSV::Parser::UnoptimizedStringIO.new("ab\n\u{3000}a\n")
+ scanner = CSV::Parser::InputsScanner.new([input],
+ Encoding::UTF_8,
+ nil,
+ chunk_size: 1)
+ each_line = scanner.each_line("\n")
+ assert_equal("ab\n", each_line.next)
+ scanner.keep_start
+ assert_equal("\u{3000}a\n", each_line.next)
+ scanner.keep_back
+ assert_equal("\u{3000}a\n", scanner.scan_all(/[^,]+/))
+ end
+
+ def test_each_line_keep_over_chunks_fit_chunk_size
+ input = CSV::Parser::UnoptimizedStringIO.new("\na")
+ scanner = CSV::Parser::InputsScanner.new([input],
+ Encoding::UTF_8,
+ nil,
+ chunk_size: 1)
+ each_line = scanner.each_line("\n")
+ assert_equal("\n", each_line.next)
+ scanner.keep_start
+ assert_equal("a", each_line.next)
+ scanner.keep_back
+ end
+end
diff --git a/test/csv/parse/test_invalid.rb b/test/csv/parse/test_invalid.rb
new file mode 100644
index 0000000..ddb59e2
--- /dev/null
+++ b/test/csv/parse/test_invalid.rb
@@ -0,0 +1,52 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseInvalid < Test::Unit::TestCase
+ def test_no_column_mixed_new_lines
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse("\n" +
+ "\r")
+ end
+ assert_equal("New line must be <\"\\n\"> not <\"\\r\"> in line 2.",
+ error.message)
+ end
+
+ def test_ignore_invalid_line
+ csv = CSV.new(<<-CSV, headers: true, return_headers: true)
+head1,head2,head3
+aaa,bbb,ccc
+ddd,ee"e.fff
+ggg,hhh,iii
+ CSV
+ headers = ["head1", "head2", "head3"]
+ assert_equal(CSV::Row.new(headers, headers),
+ csv.shift)
+ assert_equal(CSV::Row.new(headers, ["aaa", "bbb", "ccc"]),
+ csv.shift)
+ assert_equal(false, csv.eof?)
+ error = assert_raise(CSV::MalformedCSVError) do
+ csv.shift
+ end
+ assert_equal("Illegal quoting in line 3.",
+ error.message)
+ assert_equal(false, csv.eof?)
+ assert_equal(CSV::Row.new(headers, ["ggg", "hhh", "iii"]),
+ csv.shift)
+ assert_equal(true, csv.eof?)
+ end
+
+ def test_ignore_invalid_line_cr_lf
+ data = <<-CSV
+"1","OK"\r
+"2",""NOT" OK"\r
+"3","OK"\r
+CSV
+ csv = CSV.new(data)
+
+ assert_equal(['1', 'OK'], csv.shift)
+ assert_raise(CSV::MalformedCSVError) { csv.shift }
+ assert_equal(['3', 'OK'], csv.shift)
+ end
+end
diff --git a/test/csv/parse/test_liberal_parsing.rb b/test/csv/parse/test_liberal_parsing.rb
new file mode 100644
index 0000000..5796d10
--- /dev/null
+++ b/test/csv/parse/test_liberal_parsing.rb
@@ -0,0 +1,171 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseLiberalParsing < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_middle_quote_start
+ input = '"Johnson, Dwayne",Dwayne "The Rock" Johnson'
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line(input)
+ end
+ assert_equal("Illegal quoting in line 1.",
+ error.message)
+ assert_equal(["Johnson, Dwayne", 'Dwayne "The Rock" Johnson'],
+ CSV.parse_line(input, liberal_parsing: true))
+ end
+
+ def test_middle_quote_end
+ input = '"quoted" field'
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line(input)
+ end
+ assert_equal("Any value after quoted field isn't allowed in line 1.",
+ error.message)
+ assert_equal(['"quoted" field'],
+ CSV.parse_line(input, liberal_parsing: true))
+ end
+
+ def test_endline_after_quoted_field_end
+ csv = CSV.new("A\r\n\"B\"\nC\r\n", liberal_parsing: true)
+ assert_equal(["A"], csv.gets)
+ error = assert_raise(CSV::MalformedCSVError) do
+ csv.gets
+ end
+ assert_equal('Illegal end-of-line sequence outside of a quoted field <"\n"> in line 2.',
+ error.message)
+ assert_equal(["C"], csv.gets)
+ end
+
+ def test_quote_after_column_separator
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line('is,this "three," or four,fields', liberal_parsing: true)
+ end
+ assert_equal("Unclosed quoted field in line 1.",
+ error.message)
+ end
+
+ def test_quote_before_column_separator
+ assert_equal(["is", 'this "three', ' or four"', "fields"],
+ CSV.parse_line('is,this "three, or four",fields',
+ liberal_parsing: true))
+ end
+
+ def test_backslash_quote
+ assert_equal([
+ "1",
+ "\"Hamlet says, \\\"Seems",
+ "\\\" madam! Nay it is; I know not \\\"seems.\\\"\"",
+ ],
+ CSV.parse_line('1,' +
+ '"Hamlet says, \"Seems,' +
+ '\" madam! Nay it is; I know not \"seems.\""',
+ liberal_parsing: true))
+ end
+
+ def test_space_quote
+ input = <<~CSV
+ Los Angeles, 34°03'N, 118°15'W
+ New York City, 40°42'46"N, 74°00'21"W
+ Paris, 48°51'24"N, 2°21'03"E
+ CSV
+ assert_equal(
+ [
+ ["Los Angeles", " 34°03'N", " 118°15'W"],
+ ["New York City", " 40°42'46\"N", " 74°00'21\"W"],
+ ["Paris", " 48°51'24\"N", " 2°21'03\"E"],
+ ],
+ CSV.parse(input, liberal_parsing: true))
+ end
+
+ def test_double_quote_outside_quote
+ data = %Q{a,""b""}
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse(data)
+ end
+ assert_equal("Any value after quoted field isn't allowed in line 1.",
+ error.message)
+ assert_equal([
+ [["a", %Q{""b""}]],
+ [["a", %Q{"b"}]],
+ ],
+ [
+ CSV.parse(data, liberal_parsing: true),
+ CSV.parse(data,
+ liberal_parsing: {
+ double_quote_outside_quote: true,
+ }),
+ ])
+ end
+
+ class TestBackslashQuote < Test::Unit::TestCase
+ extend ::DifferentOFS
+
+ def test_double_quote_outside_quote
+ data = %Q{a,""b""}
+ assert_equal([
+ [["a", %Q{""b""}]],
+ [["a", %Q{"b"}]],
+ ],
+ [
+ CSV.parse(data,
+ liberal_parsing: {
+ backslash_quote: true
+ }),
+ CSV.parse(data,
+ liberal_parsing: {
+ backslash_quote: true,
+ double_quote_outside_quote: true
+ }),
+ ])
+ end
+
+ def test_unquoted_value
+ data = %q{\"\"a\"\"}
+ assert_equal([
+ [[%q{\"\"a\"\"}]],
+ [[%q{""a""}]],
+ ],
+ [
+ CSV.parse(data, liberal_parsing: true),
+ CSV.parse(data,
+ liberal_parsing: {
+ backslash_quote: true
+ }),
+ ])
+ end
+
+ def test_unquoted_value_multiple_characters_col_sep
+ data = %q{a<\\"b<=>x}
+ assert_equal([[%Q{a<"b}, "x"]],
+ CSV.parse(data,
+ col_sep: "<=>",
+ liberal_parsing: {
+ backslash_quote: true
+ }))
+ end
+
+ def test_quoted_value
+ data = %q{"\"\"a\"\""}
+ assert_equal([
+ [[%q{"\"\"a\"\""}]],
+ [[%q{""a""}]],
+ [[%q{""a""}]],
+ ],
+ [
+ CSV.parse(data, liberal_parsing: true),
+ CSV.parse(data,
+ liberal_parsing: {
+ backslash_quote: true
+ }),
+ CSV.parse(data,
+ liberal_parsing: {
+ backslash_quote: true,
+ double_quote_outside_quote: true
+ }),
+ ])
+ end
+ end
+end
diff --git a/test/csv/parse/test_quote_char_nil.rb b/test/csv/parse/test_quote_char_nil.rb
new file mode 100644
index 0000000..fc3b646
--- /dev/null
+++ b/test/csv/parse/test_quote_char_nil.rb
@@ -0,0 +1,93 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseQuoteCharNil < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_full
+ assert_equal(["a", "b"], CSV.parse_line(%Q{a,b}, quote_char: nil))
+ end
+
+ def test_end_with_nil
+ assert_equal(["a", nil, nil, nil], CSV.parse_line(%Q{a,,,}, quote_char: nil))
+ end
+
+ def test_nil_nil
+ assert_equal([nil, nil], CSV.parse_line(%Q{,}, quote_char: nil))
+ end
+
+ def test_unquoted_value_multiple_characters_col_sep
+ data = %q{a<b<=>x}
+ assert_equal([[%Q{a<b}, "x"]], CSV.parse(data, col_sep: "<=>", quote_char: nil))
+ end
+
+ def test_csv_header_string
+ data = <<~DATA
+ first,second,third
+ A,B,C
+ 1,2,3
+ DATA
+ assert_equal(
+ CSV::Table.new([
+ CSV::Row.new(["my", "new", "headers"], ["first", "second", "third"]),
+ CSV::Row.new(["my", "new", "headers"], ["A", "B", "C"]),
+ CSV::Row.new(["my", "new", "headers"], ["1", "2", "3"])
+ ]),
+ CSV.parse(data, headers: "my,new,headers", quote_char: nil)
+ )
+ end
+
+ def test_comma
+ assert_equal([["a", "b", nil, "d"]],
+ CSV.parse("a,b,,d", col_sep: ",", quote_char: nil))
+ end
+
+ def test_space
+ assert_equal([["a", "b", nil, "d"]],
+ CSV.parse("a b  d", col_sep: " ", quote_char: nil))
+ end
+
+ def encode_array(array, encoding)
+ array.collect do |element|
+ element ? element.encode(encoding) : element
+ end
+ end
+
+ def test_space_no_ascii
+ encoding = Encoding::UTF_16LE
+ assert_equal([encode_array(["a", "b", nil, "d"], encoding)],
+ CSV.parse("a b  d".encode(encoding),
+ col_sep: " ".encode(encoding),
+ quote_char: nil))
+ end
+
+ def test_multiple_space
+ assert_equal([["a b", nil, "d"]],
+ CSV.parse("a b    d", col_sep: "  ", quote_char: nil))
+ end
+
+ def test_multiple_characters_leading_empty_fields
+ data = <<-CSV
+<=><=>A<=>B<=>C
+1<=>2<=>3
+ CSV
+ assert_equal([
+ [nil, nil, "A", "B", "C"],
+ ["1", "2", "3"],
+ ],
+ CSV.parse(data, col_sep: "<=>", quote_char: nil))
+ end
+
+ def test_line
+ lines = [
+ "abc,def\n",
+ ]
+ csv = CSV.new(lines.join(""), quote_char: nil)
+ lines.each do |line|
+ csv.shift
+ assert_equal(line, csv.line)
+ end
+ end
+end
diff --git a/test/csv/parse/test_read.rb b/test/csv/parse/test_read.rb
new file mode 100644
index 0000000..ba6fe98
--- /dev/null
+++ b/test/csv/parse/test_read.rb
@@ -0,0 +1,27 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseRead < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_shift
+ data = <<-CSV
+1
+2
+3
+ CSV
+ csv = CSV.new(data)
+ assert_equal([
+ ["1"],
+ [["2"], ["3"]],
+ nil,
+ ],
+ [
+ csv.shift,
+ csv.read,
+ csv.shift,
+ ])
+ end
+end
diff --git a/test/csv/parse/test_rewind.rb b/test/csv/parse/test_rewind.rb
new file mode 100644
index 0000000..0aa403b
--- /dev/null
+++ b/test/csv/parse/test_rewind.rb
@@ -0,0 +1,40 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseRewind < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def parse(data, **options)
+ csv = CSV.new(data, **options)
+ records = csv.to_a
+ csv.rewind
+ [records, csv.to_a]
+ end
+
+ def test_default
+ data = <<-CSV
+Ruby,2.6.0,script
+ CSV
+ assert_equal([
+ [["Ruby", "2.6.0", "script"]],
+ [["Ruby", "2.6.0", "script"]],
+ ],
+ parse(data))
+ end
+
+ def test_have_headers
+ data = <<-CSV
+Language,Version,Type
+Ruby,2.6.0,script
+ CSV
+ assert_equal([
+ [CSV::Row.new(["Language", "Version", "Type"],
+ ["Ruby", "2.6.0", "script"])],
+ [CSV::Row.new(["Language", "Version", "Type"],
+ ["Ruby", "2.6.0", "script"])],
+ ],
+ parse(data, headers: true))
+ end
+end
diff --git a/test/csv/parse/test_row_separator.rb b/test/csv/parse/test_row_separator.rb
new file mode 100644
index 0000000..eaf6adc
--- /dev/null
+++ b/test/csv/parse/test_row_separator.rb
@@ -0,0 +1,16 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseRowSeparator < Test::Unit::TestCase
+ extend DifferentOFS
+ include Helper
+
+ def test_multiple_characters
+ with_chunk_size("1") do
+ assert_equal([["a"], ["b"]],
+ CSV.parse("a\r\nb\r\n", row_sep: "\r\n"))
+ end
+ end
+end
diff --git a/test/csv/parse/test_skip_lines.rb b/test/csv/parse/test_skip_lines.rb
new file mode 100644
index 0000000..98d67ae
--- /dev/null
+++ b/test/csv/parse/test_skip_lines.rb
@@ -0,0 +1,118 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseSkipLines < Test::Unit::TestCase
+ extend DifferentOFS
+ include Helper
+
+ def test_default
+ csv = CSV.new("a,b,c\n")
+ assert_nil(csv.skip_lines)
+ end
+
+ def test_regexp
+ csv = <<-CSV
+1
+#2
+ #3
+4
+ CSV
+ assert_equal([
+ ["1"],
+ ["4"],
+ ],
+ CSV.parse(csv, :skip_lines => /\A\s*#/))
+ end
+
+ def test_regexp_quoted
+ csv = <<-CSV
+1
+#2
+"#3"
+4
+ CSV
+ assert_equal([
+ ["1"],
+ ["#3"],
+ ["4"],
+ ],
+ CSV.parse(csv, :skip_lines => /\A\s*#/))
+ end
+
+ def test_string
+ csv = <<-CSV
+1
+.2
+3.
+4
+ CSV
+ assert_equal([
+ ["1"],
+ ["4"],
+ ],
+ CSV.parse(csv, :skip_lines => "."))
+ end
+
+ class RegexStub
+ end
+
+ def test_not_matchable
+ regex_stub = RegexStub.new
+ csv = CSV.new("1\n", :skip_lines => regex_stub)
+ error = assert_raise(ArgumentError) do
+ csv.shift
+ end
+ assert_equal(":skip_lines has to respond to #match: #{regex_stub.inspect}",
+ error.message)
+ end
+
+ class Matchable
+ def initialize(pattern)
+ @pattern = pattern
+ end
+
+ def match(line)
+ @pattern.match(line)
+ end
+ end
+
+ def test_matchable
+ csv = <<-CSV
+1
+# 2
+3
+# 4
+ CSV
+ assert_equal([
+ ["1"],
+ ["3"],
+ ],
+ CSV.parse(csv, :skip_lines => Matchable.new(/\A#/)))
+ end
+
+ def test_multibyte_data
+ # U+3042 HIRAGANA LETTER A
+ # U+3044 HIRAGANA LETTER I
+ # U+3046 HIRAGANA LETTER U
+ value = "\u3042\u3044\u3046"
+ with_chunk_size("5") do
+ assert_equal([[value], [value]],
+ CSV.parse("#{value}\n#{value}\n",
+ :skip_lines => /\A#/))
+ end
+ end
+
+ def test_empty_line_and_liberal_parsing
+ assert_equal([["a", "b"]],
+ CSV.parse("a,b\n",
+ :liberal_parsing => true,
+ :skip_lines => /^$/))
+ end
+
+ def test_crlf
+ assert_equal([["a", "b"]],
+ CSV.parse("a,b\r\n,\r\n",
+ :skip_lines => /^,+$/))
+ end
+end
diff --git a/test/csv/parse/test_strip.rb b/test/csv/parse/test_strip.rb
new file mode 100644
index 0000000..c5e3520
--- /dev/null
+++ b/test/csv/parse/test_strip.rb
@@ -0,0 +1,112 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseStrip < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def test_both
+ assert_equal(["a", "b"],
+ CSV.parse_line(%Q{ a , b }, strip: true))
+ end
+
+ def test_left
+ assert_equal(["a", "b"],
+ CSV.parse_line(%Q{ a, b}, strip: true))
+ end
+
+ def test_right
+ assert_equal(["a", "b"],
+ CSV.parse_line(%Q{a ,b }, strip: true))
+ end
+
+ def test_middle
+ assert_equal(["a b"],
+ CSV.parse_line(%Q{a b}, strip: true))
+ end
+
+ def test_quoted
+ assert_equal([" a ", " b "],
+ CSV.parse_line(%Q{" a "," b "}, strip: true))
+ end
+
+ def test_liberal_parsing
+ assert_equal([" a ", "b", " c ", " d "],
+ CSV.parse_line(%Q{" a ", b , " c "," d " },
+ strip: true,
+ liberal_parsing: true))
+ end
+
+ def test_string
+ assert_equal(["a", " b"],
+ CSV.parse_line(%Q{ a , " b" },
+ strip: " "))
+ end
+
+ def test_no_quote
+ assert_equal([" a ", " b "],
+ CSV.parse_line(%Q{" a ", b },
+ strip: %Q{"},
+ quote_char: nil))
+ end
+
+ def test_do_not_strip_cr
+ assert_equal([
+ ["a", "b "],
+ ["a", "b "],
+ ],
+ CSV.parse(%Q{"a" ,"b " \r} +
+ %Q{"a" ,"b " \r},
+ strip: true))
+ end
+
+ def test_do_not_strip_lf
+ assert_equal([
+ ["a", "b "],
+ ["a", "b "],
+ ],
+ CSV.parse(%Q{"a" ,"b " \n} +
+ %Q{"a" ,"b " \n},
+ strip: true))
+ end
+
+ def test_do_not_strip_crlf
+ assert_equal([
+ ["a", "b "],
+ ["a", "b "],
+ ],
+ CSV.parse(%Q{"a" ,"b " \r\n} +
+ %Q{"a" ,"b " \r\n},
+ strip: true))
+ end
+
+ def test_col_sep_incompatible_true
+ message = "The provided strip (true) and " \
+ "col_sep (\\t) options are incompatible."
+ assert_raise_with_message(ArgumentError, message) do
+ CSV.parse_line(%Q{"a"\t"b"\n},
+ col_sep: "\t",
+ strip: true)
+ end
+ end
+
+ def test_col_sep_incompatible_string
+ message = "The provided strip (\\t) and " \
+ "col_sep (\\t) options are incompatible."
+ assert_raise_with_message(ArgumentError, message) do
+ CSV.parse_line(%Q{"a"\t"b"\n},
+ col_sep: "\t",
+ strip: "\t")
+ end
+ end
+
+ def test_col_sep_compatible_string
+ assert_equal(
+ ["a", "b"],
+ CSV.parse_line(%Q{\va\tb\v\n},
+ col_sep: "\t",
+ strip: "\v")
+ )
+ end
+end
diff --git a/test/csv/parse/test_unconverted_fields.rb b/test/csv/parse/test_unconverted_fields.rb
new file mode 100644
index 0000000..437124e
--- /dev/null
+++ b/test/csv/parse/test_unconverted_fields.rb
@@ -0,0 +1,117 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+class TestCSVParseUnconvertedFields < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @custom = lambda {|field| /\A:(\S.*?)\s*\Z/ =~ field ? $1.to_sym : field}
+
+ @headers = ["first", "second", "third"]
+ @data = <<-CSV
+first,second,third
+1,2,3
+ CSV
+ end
+
+
+ def test_custom
+ row = CSV.parse_line("Numbers,:integer,1,:float,3.015",
+ converters: [:numeric, @custom],
+ unconverted_fields: true)
+ assert_equal([
+ ["Numbers", :integer, 1, :float, 3.015],
+ ["Numbers", ":integer", "1", ":float", "3.015"],
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+
+ def test_no_fields
+ row = CSV.parse_line("\n",
+ converters: [:numeric, @custom],
+ unconverted_fields: true)
+ assert_equal([
+ [],
+ [],
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+
+ def test_parsed_header
+ row = CSV.parse_line(@data,
+ converters: :numeric,
+ unconverted_fields: true,
+ headers: :first_row)
+ assert_equal([
+ CSV::Row.new(@headers,
+ [1, 2, 3]),
+ ["1", "2", "3"],
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+
+ def test_return_headers
+ row = CSV.parse_line(@data,
+ converters: :numeric,
+ unconverted_fields: true,
+ headers: :first_row,
+ return_headers: true)
+ assert_equal([
+ CSV::Row.new(@headers,
+ @headers),
+ @headers,
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+
+ def test_header_converters
+ row = CSV.parse_line(@data,
+ converters: :numeric,
+ unconverted_fields: true,
+ headers: :first_row,
+ return_headers: true,
+ header_converters: :symbol)
+ assert_equal([
+ CSV::Row.new(@headers.collect(&:to_sym),
+ @headers),
+ @headers,
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+
+ def test_specified_headers
+ row = CSV.parse_line("\n",
+ converters: :numeric,
+ unconverted_fields: true,
+ headers: %w{my new headers},
+ return_headers: true,
+ header_converters: :symbol)
+ assert_equal([
+ CSV::Row.new([:my, :new, :headers],
+ ["my", "new", "headers"]),
+ [],
+ ],
+ [
+ row,
+ row.unconverted_fields,
+ ])
+ end
+end
diff --git a/test/csv/test_data_converters.rb b/test/csv/test_data_converters.rb
new file mode 100644
index 0000000..c20a5d1
--- /dev/null
+++ b/test/csv/test_data_converters.rb
@@ -0,0 +1,190 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "helper"
+
+class TestCSVDataConverters < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @win_safe_time_str = Time.now.strftime("%a %b %d %H:%M:%S %Y")
+ end
+
+ def test_builtin_integer_converter
+ # does convert
+ [-5, 1, 10000000000].each do |n|
+ assert_equal(n, CSV::Converters[:integer][n.to_s])
+ end
+
+ # does not convert
+ (%w{junk 1.0} + [""]).each do |str|
+ assert_equal(str, CSV::Converters[:integer][str])
+ end
+ end
+
+ def test_builtin_float_converter
+ # does convert
+ [-5.1234, 0, 2.3e-11].each do |n|
+ assert_equal(n, CSV::Converters[:float][n.to_s])
+ end
+
+ # does not convert
+ (%w{junk 1..0 .015F} + [""]).each do |str|
+ assert_equal(str, CSV::Converters[:float][str])
+ end
+ end
+
+ def test_builtin_date_converter
+ # does convert
+ assert_instance_of(
+ Date,
+ CSV::Converters[:date][@win_safe_time_str.sub(/\d+:\d+:\d+ /, "")]
+ )
+
+ # does not convert
+ assert_instance_of(String, CSV::Converters[:date]["junk"])
+ end
+
+ def test_builtin_date_time_converter
+ # does convert
+ assert_instance_of( DateTime,
+ CSV::Converters[:date_time][@win_safe_time_str] )
+
+ # does not convert
+ assert_instance_of(String, CSV::Converters[:date_time]["junk"])
+ end
+
+ def test_builtin_date_time_converter_iso8601_date
+ iso8601_string = "2018-01-14"
+ datetime = DateTime.new(2018, 1, 14)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_minute
+ iso8601_string = "2018-01-14T22:25"
+ datetime = DateTime.new(2018, 1, 14, 22, 25)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_second
+ iso8601_string = "2018-01-14T22:25:19"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_under_second
+ iso8601_string = "2018-01-14T22:25:19.1"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_under_second_offset
+ iso8601_string = "2018-01-14T22:25:19.1+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_offset
+ iso8601_string = "2018-01-14T22:25:19+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_iso8601_utc
+ iso8601_string = "2018-01-14T22:25:19Z"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][iso8601_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_minute
+ rfc3339_string = "2018-01-14 22:25"
+ datetime = DateTime.new(2018, 1, 14, 22, 25)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_second
+ rfc3339_string = "2018-01-14 22:25:19"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_under_second
+ rfc3339_string = "2018-01-14 22:25:19.1"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_under_second_offset
+ rfc3339_string = "2018-01-14 22:25:19.1+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_offset
+ rfc3339_string = "2018-01-14 22:25:19+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_utc
+ rfc3339_string = "2018-01-14 22:25:19Z"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_minute
+ rfc3339_string = "2018-01-14\t22:25"
+ datetime = DateTime.new(2018, 1, 14, 22, 25)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_second
+ rfc3339_string = "2018-01-14\t22:25:19"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_under_second
+ rfc3339_string = "2018-01-14\t22:25:19.1"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_under_second_offset
+ rfc3339_string = "2018-01-14\t22:25:19.1+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19.1, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_offset
+ rfc3339_string = "2018-01-14\t22:25:19+09:00"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19, "+9")
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+
+ def test_builtin_date_time_converter_rfc3339_tab_utc
+ rfc3339_string = "2018-01-14\t22:25:19Z"
+ datetime = DateTime.new(2018, 1, 14, 22, 25, 19)
+ assert_equal(datetime,
+ CSV::Converters[:date_time][rfc3339_string])
+ end
+end
diff --git a/test/csv/test_encodings.rb b/test/csv/test_encodings.rb
new file mode 100644
index 0000000..f08d551
--- /dev/null
+++ b/test/csv/test_encodings.rb
@@ -0,0 +1,403 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "helper"
+
+class TestCSVEncodings < Test::Unit::TestCase
+ extend DifferentOFS
+ include Helper
+
+ def setup
+ super
+ require 'tempfile'
+ @temp_csv_file = Tempfile.new(%w"test_csv. .csv")
+ @temp_csv_path = @temp_csv_file.path
+ @temp_csv_file.close
+ end
+
+ def teardown
+ @temp_csv_file.close!
+ super
+ end
+
+ ########################################
+ ### Hand Test Some Popular Encodings ###
+ ########################################
+
+ def test_parses_utf8_encoding
+ assert_parses( [ %w[ one two … ],
+ %w[ 1 … 3 ],
+ %w[ … 5 6 ] ], "UTF-8" )
+ end
+
+ def test_parses_latin1_encoding
+ assert_parses( [ %w[ one two Résumé ],
+ %w[ 1 Résumé 3 ],
+ %w[ Résumé 5 6 ] ], "ISO-8859-1" )
+ end
+
+ def test_parses_utf16be_encoding
+ assert_parses( [ %w[ one two … ],
+ %w[ 1 … 3 ],
+ %w[ … 5 6 ] ], "UTF-16BE" )
+ end
+
+ def test_parses_shift_jis_encoding
+ assert_parses( [ %w[ 一 二 三 ],
+ %w[ 四 五 六 ],
+ %w[ 七 八 九 ] ], "Shift_JIS" )
+ end
+
+ ###########################################################
+ ### Try Simple Reading for All Non-dummy Ruby Encodings ###
+ ###########################################################
+
+ def test_reading_with_most_encodings
+ each_encoding do |encoding|
+ begin
+ assert_parses( [ %w[ abc def ],
+ %w[ ghi jkl ] ], encoding )
+ rescue Encoding::ConverterNotFoundError
+ fail("Failed to support #{encoding.name}.")
+ end
+ end
+ end
+
+ def test_regular_expression_escaping
+ each_encoding do |encoding|
+ begin
+ assert_parses( [ %w[ abc def ],
+ %w[ ghi jkl ] ], encoding, col_sep: "|" )
+ rescue Encoding::ConverterNotFoundError
+ fail("Failed to properly escape #{encoding.name}.")
+ end
+ end
+ end
+
+ def test_read_with_default_encoding
+ data = "abc"
+ default_external = Encoding.default_external
+ each_encoding do |encoding|
+ File.open(@temp_csv_path, "wb", encoding: encoding) {|f| f << data}
+ begin
+ no_warnings do
+ Encoding.default_external = encoding
+ end
+ result = CSV.read(@temp_csv_path)[0][0]
+ ensure
+ no_warnings do
+ Encoding.default_external = default_external
+ end
+ end
+ assert_equal(encoding, result.encoding)
+ end
+ end
+
+ #######################################################################
+ ### Stress Test ASCII Compatible and Non-ASCII Compatible Encodings ###
+ #######################################################################
+
+ def test_auto_line_ending_detection
+ # arrange data to place a \r at the end of CSV's read ahead point
+ encode_for_tests([["a" * 509]], row_sep: "\r\n") do |data|
+ assert_equal("\r\n".encode(data.encoding), CSV.new(data).row_sep)
+ end
+ end
+
+ def test_csv_chars_are_transcoded
+ encode_for_tests([%w[abc def]]) do |data|
+ %w[col_sep row_sep quote_char].each do |csv_char|
+ assert_equal( "|".encode(data.encoding),
+ CSV.new(data, csv_char.to_sym => "|").send(csv_char) )
+ end
+ end
+ end
+
+ def test_parser_works_with_encoded_headers
+ encode_for_tests([%w[one two three], %w[1 2 3]]) do |data|
+ parsed = CSV.parse(data, headers: true)
+ assert_all?(parsed.headers, "Wrong data encoding.") {|h| h.encoding == data.encoding}
+ parsed.each do |row|
+ assert_all?(row.fields, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+ end
+
+ def test_built_in_converters_transcode_to_utf_8_then_convert
+ encode_for_tests([%w[one two three], %w[1 2 3]]) do |data|
+ parsed = CSV.parse(data, converters: :integer)
+ assert_all?(parsed[0], "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ assert_equal([1, 2, 3], parsed[1])
+ end
+ end
+
+ def test_built_in_header_converters_transcode_to_utf_8_then_convert
+ encode_for_tests([%w[one two three], %w[1 2 3]]) do |data|
+ parsed = CSV.parse( data, headers: true,
+ header_converters: :downcase )
+ assert_all?(parsed.headers, "Wrong data encoding.") {|h| h.encoding.name == "UTF-8"}
+ assert_all?(parsed[0].fields, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+
+ def test_open_allows_you_to_set_encodings
+ encode_for_tests([%w[abc def]]) do |data|
+ # read and write in encoding
+ File.open(@temp_csv_path, "wb:#{data.encoding.name}") { |f| f << data }
+ CSV.open(@temp_csv_path, "rb:#{data.encoding.name}") do |csv|
+ csv.each do |row|
+ assert_all?(row, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+
+ # read and write with transcoding
+ File.open(@temp_csv_path, "wb:UTF-32BE:#{data.encoding.name}") do |f|
+ f << data
+ end
+ CSV.open(@temp_csv_path, "rb:UTF-32BE:#{data.encoding.name}") do |csv|
+ csv.each do |row|
+ assert_all?(row, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+ end
+ end
+
+ def test_foreach_allows_you_to_set_encodings
+ encode_for_tests([%w[abc def]]) do |data|
+ # read and write in encoding
+ File.open(@temp_csv_path, "wb", encoding: data.encoding) { |f| f << data }
+ CSV.foreach(@temp_csv_path, encoding: data.encoding) do |row|
+ row.each {|f| assert_equal(f.encoding, data.encoding)}
+ end
+
+ # read and write with transcoding
+ File.open(@temp_csv_path, "wb:UTF-32BE:#{data.encoding.name}") do |f|
+ f << data
+ end
+ CSV.foreach( @temp_csv_path,
+ encoding: "UTF-32BE:#{data.encoding.name}" ) do |row|
+ assert_all?(row, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+ end
+
+ def test_read_allows_you_to_set_encodings
+ encode_for_tests([%w[abc def]]) do |data|
+ # read and write in encoding
+ File.open(@temp_csv_path, "wb:#{data.encoding.name}") { |f| f << data }
+ rows = CSV.read(@temp_csv_path, encoding: data.encoding.name)
+ assert_all?(rows.flatten, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+
+ # read and write with transcoding
+ File.open(@temp_csv_path, "wb:UTF-32BE:#{data.encoding.name}") do |f|
+ f << data
+ end
+ rows = CSV.read( @temp_csv_path,
+ encoding: "UTF-32BE:#{data.encoding.name}" )
+ assert_all?(rows.flatten, "Wrong data encoding.") {|f| f.encoding == data.encoding}
+ end
+ end
+
+ #################################
+ ### Write CSV in any Encoding ###
+ #################################
+
+ def test_can_write_csv_in_any_encoding
+ each_encoding do |encoding|
+ # test generate_line with encoding hint
+ begin
+ csv = %w[abc d|ef].map { |f| f.encode(encoding) }.
+ to_csv(col_sep: "|", encoding: encoding.name)
+ rescue Encoding::ConverterNotFoundError
+ next
+ end
+ assert_equal(encoding, csv.encoding)
+
+ # test generate_line with encoding guessing from fields
+ csv = %w[abc d|ef].map { |f| f.encode(encoding) }.to_csv(col_sep: "|")
+ assert_equal(encoding, csv.encoding)
+
+ # writing to files
+ data = encode_ary([%w[abc d,ef], %w[123 456 ]], encoding)
+ CSV.open(@temp_csv_path, "wb:#{encoding.name}") do |f|
+ data.each { |row| f << row }
+ end
+ assert_equal(data, CSV.read(@temp_csv_path, encoding: encoding.name))
+ end
+ end
+
+ def test_encoding_is_upgraded_during_writing_as_needed
+ data = ["foo".force_encoding("US-ASCII"), "\u3042"]
+ assert_equal("US-ASCII", data.first.encoding.name)
+ assert_equal("UTF-8", data.last.encoding.name)
+ assert_equal("UTF-8", data.join('').encoding.name)
+ assert_equal("UTF-8", data.to_csv.encoding.name)
+ end
+
+ def test_encoding_is_upgraded_for_ascii_content_during_writing_as_needed
+ data = ["foo".force_encoding("ISO-8859-1"), "\u3042"]
+ assert_equal("ISO-8859-1", data.first.encoding.name)
+ assert_equal("UTF-8", data.last.encoding.name)
+ assert_equal("UTF-8", data.join('').encoding.name)
+ assert_equal("UTF-8", data.to_csv.encoding.name)
+ end
+
+ def test_encoding_is_not_upgraded_for_non_ascii_content_during_writing_as_needed
+ data = ["\u00c0".encode("ISO-8859-1"), "\u3042"]
+ assert_equal([
+ "ISO-8859-1",
+ "UTF-8",
+ ],
+ data.collect {|field| field.encoding.name})
+ assert_raise(Encoding::CompatibilityError) do
+ data.to_csv
+ end
+ end
+
+ def test_explicit_encoding
+ bug9766 = '[ruby-core:62113] [Bug #9766]'
+ s = CSV.generate(encoding: "Windows-31J") do |csv|
+ csv << ["foo".force_encoding("ISO-8859-1"), "\u3042"]
+ end
+ assert_equal(["foo,\u3042\n".encode(Encoding::Windows_31J), Encoding::Windows_31J], [s, s.encoding], bug9766)
+ end
+
+ def test_encoding_with_default_internal
+ with_default_internal(Encoding::UTF_8) do
+ s = CSV.generate(String.new(encoding: Encoding::Big5), encoding: Encoding::Big5) do |csv|
+ csv << ["漢字"]
+ end
+ assert_equal(["漢字\n".encode(Encoding::Big5), Encoding::Big5], [s, s.encoding])
+ end
+ end
+
+ def test_row_separator_detection_with_invalid_encoding
+ csv = CSV.new("invalid,\xF8\r\nvalid,x\r\n".force_encoding("UTF-8"),
+ encoding: "UTF-8")
+ assert_equal("\r\n", csv.row_sep)
+ end
+
+ def test_invalid_encoding_row_error
+ csv = CSV.new("valid,x\rinvalid,\xF8\r".force_encoding("UTF-8"),
+ encoding: "UTF-8", row_sep: "\r")
+ error = assert_raise(CSV::MalformedCSVError) do
+ csv.shift
+ csv.shift
+ end
+ assert_equal("Invalid byte sequence in UTF-8 in line 2.",
+ error.message)
+ end
+
+ def test_string_input_transcode
+ # U+3042 HIRAGANA LETTER A
+ # U+3044 HIRAGANA LETTER I
+ # U+3046 HIRAGANA LETTER U
+ value = "\u3042\u3044\u3046"
+ csv = CSV.new(value, encoding: "UTF-8:EUC-JP")
+ assert_equal([[value.encode("EUC-JP")]],
+ csv.read)
+ end
+
+ def test_string_input_set_encoding_string
+ # U+3042 HIRAGANA LETTER A
+ # U+3044 HIRAGANA LETTER I
+ # U+3046 HIRAGANA LETTER U
+ value = "\u3042\u3044\u3046".encode("EUC-JP")
+ csv = CSV.new(value.dup.force_encoding("UTF-8"), encoding: "EUC-JP")
+ assert_equal([[value.encode("EUC-JP")]],
+ csv.read)
+ end
+
+ def test_string_input_set_encoding_encoding
+ # U+3042 HIRAGANA LETTER A
+ # U+3044 HIRAGANA LETTER I
+ # U+3046 HIRAGANA LETTER U
+ value = "\u3042\u3044\u3046".encode("EUC-JP")
+ csv = CSV.new(value.dup.force_encoding("UTF-8"),
+ encoding: Encoding.find("EUC-JP"))
+ assert_equal([[value.encode("EUC-JP")]],
+ csv.read)
+ end
+
+ private
+
+ def assert_parses(fields, encoding, **options)
+ encoding = Encoding.find(encoding) unless encoding.is_a? Encoding
+ orig_fields = fields
+ fields = encode_ary(fields, encoding)
+ data = ary_to_data(fields, **options)
+ parsed = CSV.parse(data, **options)
+ assert_equal(fields, parsed)
+ parsed.flatten.each_with_index do |field, i|
+ assert_equal(encoding, field.encoding, "Field[#{i + 1}] was transcoded.")
+ end
+ File.open(@temp_csv_path, "wb") {|f| f.print(data)}
+ CSV.open(@temp_csv_path, "rb:#{encoding}", **options) do |csv|
+ csv.each_with_index do |row, i|
+ assert_equal(fields[i], row)
+ end
+ end
+ begin
+ CSV.open(@temp_csv_path,
+ "rb:#{encoding}:#{__ENCODING__}",
+ **options) do |csv|
+ csv.each_with_index do |row, i|
+ assert_equal(orig_fields[i], row)
+ end
+ end unless encoding == __ENCODING__
+ rescue Encoding::ConverterNotFoundError
+ end
+ options[:encoding] = encoding.name
+ CSV.open(@temp_csv_path, **options) do |csv|
+ csv.each_with_index do |row, i|
+ assert_equal(fields[i], row)
+ end
+ end
+ options.delete(:encoding)
+ options[:external_encoding] = encoding.name
+ options[:internal_encoding] = __ENCODING__.name
+ begin
+ CSV.open(@temp_csv_path, **options) do |csv|
+ csv.each_with_index do |row, i|
+ assert_equal(orig_fields[i], row)
+ end
+ end unless encoding == __ENCODING__
+ rescue Encoding::ConverterNotFoundError
+ end
+ end
+
+ def encode_ary(ary, encoding)
+ ary.map { |row| row.map { |field| field.encode(encoding) } }
+ end
+
+ def ary_to_data(ary, **options)
+ encoding = ary.flatten.first.encoding
+ quote_char = (options[:quote_char] || '"').encode(encoding)
+ col_sep = (options[:col_sep] || ",").encode(encoding)
+ row_sep = (options[:row_sep] || "\n").encode(encoding)
+ ary.map { |row|
+ row.map { |field|
+ [quote_char, field.encode(encoding), quote_char].join('')
+ }.join(col_sep) + row_sep
+ }.join('').encode(encoding)
+ end
+
+ def encode_for_tests(data, **options)
+ yield ary_to_data(encode_ary(data, "UTF-8"), **options)
+ yield ary_to_data(encode_ary(data, "UTF-16BE"), **options)
+ end
+
+ def each_encoding
+ Encoding.list.each do |encoding|
+ next if encoding.dummy? # skip "dummy" encodings
+ yield encoding
+ end
+ end
+
+ def no_warnings
+ old_verbose, $VERBOSE = $VERBOSE, nil
+ yield
+ ensure
+ $VERBOSE = old_verbose
+ end
+end
diff --git a/test/csv/test_features.rb b/test/csv/test_features.rb
new file mode 100644
index 0000000..d6eb2dc
--- /dev/null
+++ b/test/csv/test_features.rb
@@ -0,0 +1,359 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+begin
+ require "zlib"
+rescue LoadError
+end
+
+require_relative "helper"
+require "tempfile"
+
+class TestCSVFeatures < Test::Unit::TestCase
+ extend DifferentOFS
+
+ TEST_CASES = [ [%Q{a,b}, ["a", "b"]],
+ [%Q{a,"""b"""}, ["a", "\"b\""]],
+ [%Q{a,"""b"}, ["a", "\"b"]],
+ [%Q{a,"b"""}, ["a", "b\""]],
+ [%Q{a,"\nb"""}, ["a", "\nb\""]],
+ [%Q{a,"""\nb"}, ["a", "\"\nb"]],
+ [%Q{a,"""\nb\n"""}, ["a", "\"\nb\n\""]],
+ [%Q{a,"""\nb\n""",\nc}, ["a", "\"\nb\n\"", nil]],
+ [%Q{a,,,}, ["a", nil, nil, nil]],
+ [%Q{,}, [nil, nil]],
+ [%Q{"",""}, ["", ""]],
+ [%Q{""""}, ["\""]],
+ [%Q{"""",""}, ["\"",""]],
+ [%Q{,""}, [nil,""]],
+ [%Q{,"\r"}, [nil,"\r"]],
+ [%Q{"\r\n,"}, ["\r\n,"]],
+ [%Q{"\r\n,",}, ["\r\n,", nil]] ]
+
+ def setup
+ super
+ @sample_data = <<-CSV
+line,1,abc
+line,2,"def\nghi"
+
+line,4,jkl
+ CSV
+ @csv = CSV.new(@sample_data)
+ end
+
+ def test_col_sep
+ [";", "\t"].each do |sep|
+ TEST_CASES.each do |test_case|
+ assert_equal( test_case.last.map { |t| t.tr(",", sep) unless t.nil? },
+ CSV.parse_line( test_case.first.tr(",", sep),
+ col_sep: sep ) )
+ end
+ end
+ assert_equal([",,,", nil], CSV.parse_line(",,,;", col_sep: ";"))
+ end
+
+ def test_col_sep_nil
+ assert_raise_with_message(ArgumentError,
+ ":col_sep must be 1 or more characters: nil") do
+ CSV.parse(@sample_data, col_sep: nil)
+ end
+ end
+
+ def test_col_sep_empty
+ assert_raise_with_message(ArgumentError,
+ ":col_sep must be 1 or more characters: \"\"") do
+ CSV.parse(@sample_data, col_sep: "")
+ end
+ end
+
+ def test_row_sep
+ error = assert_raise(CSV::MalformedCSVError) do
+ CSV.parse_line("1,2,3\n,4,5\r\n", row_sep: "\r\n")
+ end
+ assert_equal("Unquoted fields do not allow new line <\"\\n\"> in line 1.",
+ error.message)
+ assert_equal( ["1", "2", "3\n", "4", "5"],
+ CSV.parse_line(%Q{1,2,"3\n",4,5\r\n}, row_sep: "\r\n"))
+ end
+
+ def test_quote_char
+ TEST_CASES.each do |test_case|
+ assert_equal(test_case.last.map {|t| t.tr('"', "'") unless t.nil?},
+ CSV.parse_line(test_case.first.tr('"', "'"),
+ quote_char: "'" ))
+ end
+ end
+
+ def test_quote_char_special_regexp_char
+ TEST_CASES.each do |test_case|
+ assert_equal(test_case.last.map {|t| t.tr('"', "|") unless t.nil?},
+ CSV.parse_line(test_case.first.tr('"', "|"),
+ quote_char: "|"))
+ end
+ end
+
+ def test_quote_char_special_regexp_char_liberal_parsing
+ TEST_CASES.each do |test_case|
+ assert_equal(test_case.last.map {|t| t.tr('"', "|") unless t.nil?},
+ CSV.parse_line(test_case.first.tr('"', "|"),
+ quote_char: "|",
+ liberal_parsing: true))
+ end
+ end
+
+ def test_csv_char_readers
+ %w[col_sep row_sep quote_char].each do |reader|
+ csv = CSV.new("abc,def", reader.to_sym => "|")
+ assert_equal("|", csv.send(reader))
+ end
+ end
+
+ def test_row_sep_auto_discovery
+ ["\r\n", "\n", "\r"].each do |line_end|
+ data = "1,2,3#{line_end}4,5#{line_end}"
+ discovered = CSV.new(data).row_sep
+ assert_equal(line_end, discovered)
+ end
+
+ assert_equal("\n", CSV.new("\n\r\n\r").row_sep)
+
+ assert_equal($/, CSV.new("").row_sep)
+
+ assert_equal($/, CSV.new(STDERR).row_sep)
+ end
+
+ def test_line
+ lines = [
+ %Q(\u{3000}abc,def\n),
+ %Q(\u{3000}abc,"d\nef"\n),
+ %Q(\u{3000}abc,"d\r\nef"\n),
+ %Q(\u{3000}abc,"d\ref")
+ ]
+ csv = CSV.new(lines.join(''))
+ lines.each do |line|
+ csv.shift
+ assert_equal(line, csv.line)
+ end
+ end
+
+ def test_lineno
+ assert_equal(5, @sample_data.lines.to_a.size)
+
+ 4.times do |line_count|
+ assert_equal(line_count, @csv.lineno)
+ assert_not_nil(@csv.shift)
+ assert_equal(line_count + 1, @csv.lineno)
+ end
+ assert_nil(@csv.shift)
+ end
+
+ def test_readline
+ test_lineno
+
+ @csv.rewind
+
+ test_lineno
+ end
+
+ def test_unknown_options
+ assert_raise_with_message(ArgumentError, /unknown keyword/) {
+ CSV.new(@sample_data, unknown: :error)
+ }
+ assert_raise_with_message(ArgumentError, /unknown keyword/) {
+ CSV.new(@sample_data, universal_newline: true)
+ }
+ end
+
+ def test_skip_blanks
+ assert_equal(4, @csv.to_a.size)
+
+ @csv = CSV.new(@sample_data, skip_blanks: true)
+
+ count = 0
+ @csv.each do |row|
+ count += 1
+ assert_equal("line", row.first)
+ end
+ assert_equal(3, count)
+ end
+
+ def test_csv_behavior_readers
+ %w[ unconverted_fields return_headers write_headers
+ skip_blanks force_quotes ].each do |behavior|
+ assert_not_predicate(CSV.new("abc,def"), "#{behavior}?", "Behavior defaulted to on.")
+ csv = CSV.new("abc,def", behavior.to_sym => true)
+ assert_predicate(csv, "#{behavior}?", "Behavior change not registered.")
+ end
+ end
+
+ def test_converters_reader
+ # no change
+ assert_equal( [:integer],
+ CSV.new("abc,def", converters: [:integer]).converters )
+
+ # just one
+ assert_equal( [:integer],
+ CSV.new("abc,def", converters: :integer).converters )
+
+ # expanded
+ assert_equal( [:integer, :float],
+ CSV.new("abc,def", converters: :numeric).converters )
+
+ # custom
+ csv = CSV.new("abc,def", converters: [:integer, lambda { }])
+ assert_equal(2, csv.converters.size)
+ assert_equal(:integer, csv.converters.first)
+ assert_instance_of(Proc, csv.converters.last)
+ end
+
+ def test_header_converters_reader
+ # no change
+ hc = :header_converters
+ assert_equal([:downcase], CSV.new("abc,def", hc => [:downcase]).send(hc))
+
+ # just one
+ assert_equal([:downcase], CSV.new("abc,def", hc => :downcase).send(hc))
+
+ # custom
+ csv = CSV.new("abc,def", hc => [:symbol, lambda { }])
+ assert_equal(2, csv.send(hc).size)
+ assert_equal(:symbol, csv.send(hc).first)
+ assert_instance_of(Proc, csv.send(hc).last)
+ end
+
+ # reported by Kev Jackson
+ def test_failing_to_escape_col_sep
+ assert_nothing_raised(Exception) { CSV.new(String.new, col_sep: "|") }
+ end
+
+ # reported by Chris Roos
+ def test_failing_to_reset_headers_in_rewind
+ csv = CSV.new("forename,surname", headers: true, return_headers: true)
+ csv.each {|row| assert_predicate row, :header_row?}
+ csv.rewind
+ csv.each {|row| assert_predicate row, :header_row?}
+ end
+
+ def test_gzip_reader
+ zipped = nil
+ assert_nothing_raised(NoMethodError) do
+ zipped = CSV.new(
+ Zlib::GzipReader.open(
+ File.join(File.dirname(__FILE__), "line_endings.gz")
+ )
+ )
+ end
+ assert_equal("\r\n", zipped.row_sep)
+ ensure
+ zipped.close
+ end if defined?(Zlib::GzipReader)
+
+ def test_gzip_writer
+ Tempfile.create(%w"temp .gz") {|tempfile|
+ tempfile.close
+ file = tempfile.path
+ zipped = nil
+ assert_nothing_raised(NoMethodError) do
+ zipped = CSV.new(Zlib::GzipWriter.open(file))
+ end
+ zipped << %w[one two three]
+ zipped << [1, 2, 3]
+ zipped.close
+
+ assert_include(Zlib::GzipReader.open(file) {|f| f.read},
+ $INPUT_RECORD_SEPARATOR, "@row_sep did not default")
+ }
+ end if defined?(Zlib::GzipWriter)
+
+ def test_inspect_is_smart_about_io_types
+ str = CSV.new("string,data").inspect
+ assert_include(str, "io_type:StringIO", "IO type not detected.")
+
+ str = CSV.new($stderr).inspect
+ assert_include(str, "io_type:$stderr", "IO type not detected.")
+
+ Tempfile.create(%w"temp .csv") {|tempfile|
+ tempfile.close
+ path = tempfile.path
+ File.open(path, "w") { |csv| csv << "one,two,three\n1,2,3\n" }
+ str = CSV.open(path) { |csv| csv.inspect }
+ assert_include(str, "io_type:File", "IO type not detected.")
+ }
+ end
+
+ def test_inspect_shows_key_attributes
+ str = @csv.inspect
+ %w[lineno col_sep row_sep quote_char].each do |attr_name|
+ assert_match(/\b#{attr_name}:[^\s>]+/, str)
+ end
+ end
+
+ def test_inspect_shows_headers_when_available
+ csv = CSV.new("one,two,three\n1,2,3\n", headers: true)
+ assert_include(csv.inspect, "headers:true", "Header hint not shown.")
+ csv.shift # load headers
+ assert_match(/headers:\[[^\]]+\]/, csv.inspect)
+ end
+
+ def test_inspect_encoding_is_ascii_compatible
+ csv = CSV.new("one,two,three\n1,2,3\n".encode("UTF-16BE"))
+ assert_send([Encoding, :compatible?,
+ Encoding.find("US-ASCII"), csv.inspect.encoding],
+ "inspect() was not ASCII compatible.")
+ end
+
+ def test_version
+ assert_not_nil(CSV::VERSION)
+ assert_instance_of(String, CSV::VERSION)
+ assert_predicate(CSV::VERSION, :frozen?)
+ assert_match(/\A\d\.\d\.\d\z/, CSV::VERSION)
+ end
+
+ def test_table_nil_equality
+ assert_nothing_raised(NoMethodError) { CSV.parse("test", headers: true) == nil }
+ end
+
+ # non-seekable input stream for testing https://github.com/ruby/csv/issues/44
+ class DummyIO
+ extend Forwardable
+ def_delegators :@io, :gets, :read, :pos, :eof? # no seek or rewind!
+ def initialize(data)
+ @io = StringIO.new(data)
+ end
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_lf
+ c = CSV.new(DummyIO.new("one,two,three\nfoo,bar,baz\n"))
+ assert_equal [["one", "two", "three"], ["foo", "bar", "baz"]], c.each.to_a
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_cr
+ c = CSV.new(DummyIO.new("one,two,three\rfoo,bar,baz\r"))
+ assert_equal [["one", "two", "three"], ["foo", "bar", "baz"]], c.each.to_a
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_cr_lf
+ c = CSV.new(DummyIO.new("one,two,three\r\nfoo,bar,baz\r\n"))
+ assert_equal [["one", "two", "three"], ["foo", "bar", "baz"]], c.each.to_a
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_1024_over_lf
+ table = (1..10).map { |row| (1..200).map { |col| "row#{row}col#{col}" }.to_a }.to_a
+ input = table.map { |line| line.join(",") }.join("\n")
+ c = CSV.new(DummyIO.new(input))
+ assert_equal table, c.each.to_a
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_1024_over_cr_lf
+ table = (1..10).map { |row| (1..200).map { |col| "row#{row}col#{col}" }.to_a }.to_a
+ input = table.map { |line| line.join(",") }.join("\r\n")
+ c = CSV.new(DummyIO.new(input))
+ assert_equal table, c.each.to_a
+ end
+
+ def test_line_separator_autodetection_for_non_seekable_input_many_cr_only
+ # input with lots of CRs (to make sure no bytes are lost due to look-ahead)
+ c = CSV.new(DummyIO.new("foo\r" + "\r" * 9999 + "bar\r"))
+ assert_equal [["foo"]] + [[]] * 9999 + [["bar"]], c.each.to_a
+ end
+end
diff --git a/test/csv/test_patterns.rb b/test/csv/test_patterns.rb
new file mode 100644
index 0000000..881f03a
--- /dev/null
+++ b/test/csv/test_patterns.rb
@@ -0,0 +1,27 @@
+# frozen_string_literal: true
+
+require_relative "helper"
+
+class TestCSVPatternMatching < Test::Unit::TestCase
+
+ def test_hash
+ case CSV::Row.new(%i{A B C}, [1, 2, 3])
+ in B: b, C: c
+ assert_equal([2, 3], [b, c])
+ end
+ end
+
+ def test_hash_rest
+ case CSV::Row.new(%i{A B C}, [1, 2, 3])
+ in B: b, **rest
+ assert_equal([2, { A: 1, C: 3 }], [b, rest])
+ end
+ end
+
+ def test_array
+ case CSV::Row.new(%i{A B C}, [1, 2, 3])
+ in *, matched
+ assert_equal(3, matched)
+ end
+ end
+end
diff --git a/test/csv/test_row.rb b/test/csv/test_row.rb
new file mode 100644
index 0000000..b717945
--- /dev/null
+++ b/test/csv/test_row.rb
@@ -0,0 +1,435 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "helper"
+
+class TestCSVRow < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @row = CSV::Row.new(%w{A B C A A}, [1, 2, 3, 4])
+ end
+
+ def test_initialize
+ # basic
+ row = CSV::Row.new(%w{A B C}, [1, 2, 3])
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([["A", 1], ["B", 2], ["C", 3]], row.to_a)
+
+ # missing headers
+ row = CSV::Row.new(%w{A}, [1, 2, 3])
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([["A", 1], [nil, 2], [nil, 3]], row.to_a)
+
+ # missing fields
+ row = CSV::Row.new(%w{A B C}, [1, 2])
+ assert_not_nil(row)
+ assert_instance_of(CSV::Row, row)
+ assert_equal([["A", 1], ["B", 2], ["C", nil]], row.to_a)
+ end
+
+ def test_row_type
+ # field rows
+ row = CSV::Row.new(%w{A B C}, [1, 2, 3]) # implicit
+ assert_not_predicate(row, :header_row?)
+ assert_predicate(row, :field_row?)
+ row = CSV::Row.new(%w{A B C}, [1, 2, 3], false) # explicit
+ assert_not_predicate(row, :header_row?)
+ assert_predicate(row, :field_row?)
+
+ # header row
+ row = CSV::Row.new(%w{A B C}, [1, 2, 3], true)
+ assert_predicate(row, :header_row?)
+ assert_not_predicate(row, :field_row?)
+ end
+
+ def test_headers
+ assert_equal(%w{A B C A A}, @row.headers)
+ end
+
+ def test_field
+ # by name
+ assert_equal(2, @row.field("B"))
+ assert_equal(2, @row["B"]) # alias
+
+ # by index
+ assert_equal(3, @row.field(2))
+
+ # by range
+ assert_equal([2,3], @row.field(1..2))
+
+ # missing
+ assert_nil(@row.field("Missing"))
+ assert_nil(@row.field(10))
+
+ # minimum index
+ assert_equal(1, @row.field("A"))
+ assert_equal(1, @row.field("A", 0))
+ assert_equal(4, @row.field("A", 1))
+ assert_equal(4, @row.field("A", 2))
+ assert_equal(4, @row.field("A", 3))
+ assert_equal(nil, @row.field("A", 4))
+ assert_equal(nil, @row.field("A", 5))
+ end
+
+ def test_fetch
+ # only by name
+ assert_equal(2, @row.fetch('B'))
+
+ # missing header raises KeyError
+ assert_raise KeyError do
+ @row.fetch('foo')
+ end
+
+ # missing header yields itself to block
+ assert_equal 'bar', @row.fetch('foo') { |header|
+ header == 'foo' ? 'bar' : false }
+
+ # missing header returns the given default value
+ assert_equal 'bar', @row.fetch('foo', 'bar')
+
+ # more than one vararg raises ArgumentError
+ assert_raise ArgumentError do
+ @row.fetch('foo', 'bar', 'baz')
+ end
+ end
+
+ def test_has_key?
+ assert_equal(true, @row.has_key?('B'))
+ assert_equal(false, @row.has_key?('foo'))
+
+ # aliases
+ assert_equal(true, @row.header?('B'))
+ assert_equal(false, @row.header?('foo'))
+
+ assert_equal(true, @row.include?('B'))
+ assert_equal(false, @row.include?('foo'))
+
+ assert_equal(true, @row.member?('B'))
+ assert_equal(false, @row.member?('foo'))
+
+ assert_equal(true, @row.key?('B'))
+ assert_equal(false, @row.key?('foo'))
+ end
+
+ def test_set_field
+ # set field by name
+ assert_equal(100, @row["A"] = 100)
+
+ # set field by index
+ assert_equal(300, @row[3] = 300)
+
+ # set field by name and minimum index
+ assert_equal([:a, :b, :c], @row["A", 4] = [:a, :b, :c])
+
+ # verify the changes
+ assert_equal( [ ["A", 100],
+ ["B", 2],
+ ["C", 3],
+ ["A", 300],
+ ["A", [:a, :b, :c]] ], @row.to_a )
+
+ # assigning an index past the end
+ assert_equal("End", @row[10] = "End")
+ assert_equal( [ ["A", 100],
+ ["B", 2],
+ ["C", 3],
+ ["A", 300],
+ ["A", [:a, :b, :c]],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, "End"] ], @row.to_a )
+
+ # assigning a new field by header
+ assert_equal("New", @row[:new] = "New")
+ assert_equal( [ ["A", 100],
+ ["B", 2],
+ ["C", 3],
+ ["A", 300],
+ ["A", [:a, :b, :c]],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, nil],
+ [nil, "End"],
+ [:new, "New"] ], @row.to_a )
+ end
+
+ def test_append
+ # add a value
+ assert_equal(@row, @row << "Value")
+ assert_equal( [ ["A", 1],
+ ["B", 2],
+ ["C", 3],
+ ["A", 4],
+ ["A", nil],
+ [nil, "Value"] ], @row.to_a )
+
+ # add a pair
+ assert_equal(@row, @row << %w{Header Field})
+ assert_equal( [ ["A", 1],
+ ["B", 2],
+ ["C", 3],
+ ["A", 4],
+ ["A", nil],
+ [nil, "Value"],
+ %w{Header Field} ], @row.to_a )
+
+ # a pair with Hash syntax
+ assert_equal(@row, @row << {key: :value})
+ assert_equal( [ ["A", 1],
+ ["B", 2],
+ ["C", 3],
+ ["A", 4],
+ ["A", nil],
+ [nil, "Value"],
+ %w{Header Field},
+ [:key, :value] ], @row.to_a )
+
+ # multiple fields at once
+ assert_equal(@row, @row.push(100, 200, [:last, 300]))
+ assert_equal( [ ["A", 1],
+ ["B", 2],
+ ["C", 3],
+ ["A", 4],
+ ["A", nil],
+ [nil, "Value"],
+ %w{Header Field},
+ [:key, :value],
+ [nil, 100],
+ [nil, 200],
+ [:last, 300] ], @row.to_a )
+ end
+
+ def test_delete
+ # by index
+ assert_equal(["B", 2], @row.delete(1))
+
+ # by header
+ assert_equal(["C", 3], @row.delete("C"))
+
+ end
+
+ def test_delete_if
+ assert_equal(@row, @row.delete_if { |h, f| h == "A" and not f.nil? })
+ assert_equal([["B", 2], ["C", 3], ["A", nil]], @row.to_a)
+ end
+
+ def test_delete_if_without_block
+ enum = @row.delete_if
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@row.size, enum.size)
+
+ assert_equal(@row, enum.each { |h, f| h == "A" and not f.nil? })
+ assert_equal([["B", 2], ["C", 3], ["A", nil]], @row.to_a)
+ end
+
+ def test_fields
+ # all fields
+ assert_equal([1, 2, 3, 4, nil], @row.fields)
+
+ # by header
+ assert_equal([1, 3], @row.fields("A", "C"))
+
+ # by index
+ assert_equal([2, 3, nil], @row.fields(1, 2, 10))
+
+ # by both
+ assert_equal([2, 3, 4], @row.fields("B", "C", 3))
+
+ # with minimum indices
+ assert_equal([2, 3, 4], @row.fields("B", "C", ["A", 3]))
+
+ # by header range
+ assert_equal([2, 3], @row.values_at("B".."C"))
+ end
+
+ def test_index
+ # basic usage
+ assert_equal(0, @row.index("A"))
+ assert_equal(1, @row.index("B"))
+ assert_equal(2, @row.index("C"))
+ assert_equal(nil, @row.index("Z"))
+
+ # with minimum index
+ assert_equal(0, @row.index("A"))
+ assert_equal(0, @row.index("A", 0))
+ assert_equal(3, @row.index("A", 1))
+ assert_equal(3, @row.index("A", 2))
+ assert_equal(3, @row.index("A", 3))
+ assert_equal(4, @row.index("A", 4))
+ assert_equal(nil, @row.index("A", 5))
+ end
+
+ def test_queries
+ # fields
+ assert(@row.field?(4))
+ assert(@row.field?(nil))
+ assert(!@row.field?(10))
+ end
+
+ def test_each
+ # array style
+ ary = @row.to_a
+ @row.each do |pair|
+ assert_equal(ary.first.first, pair.first)
+ assert_equal(ary.shift.last, pair.last)
+ end
+
+ # hash style
+ ary = @row.to_a
+ @row.each do |header, field|
+ assert_equal(ary.first.first, header)
+ assert_equal(ary.shift.last, field)
+ end
+
+ # verify that we can chain the call
+ assert_equal(@row, @row.each { })
+
+ # without block
+ ary = @row.to_a
+ enum = @row.each
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@row.size, enum.size)
+ enum.each do |pair|
+ assert_equal(ary.first.first, pair.first)
+ assert_equal(ary.shift.last, pair.last)
+ end
+ end
+
+ def test_each_pair
+ assert_equal([
+ ["A", 1],
+ ["B", 2],
+ ["C", 3],
+ ["A", 4],
+ ["A", nil],
+ ],
+ @row.each_pair.to_a)
+ end
+
+ def test_enumerable
+ assert_equal( [["A", 1], ["A", 4], ["A", nil]],
+ @row.select { |pair| pair.first == "A" } )
+
+ assert_equal(10, @row.inject(0) { |sum, (_, n)| sum + (n || 0) })
+ end
+
+ def test_to_a
+ row = CSV::Row.new(%w{A B C}, [1, 2, 3]).to_a
+ assert_instance_of(Array, row)
+ row.each do |pair|
+ assert_instance_of(Array, pair)
+ assert_equal(2, pair.size)
+ end
+ assert_equal([["A", 1], ["B", 2], ["C", 3]], row)
+ end
+
+ def test_to_hash
+ hash = @row.to_hash
+ assert_equal({"A" => @row["A"], "B" => @row["B"], "C" => @row["C"]}, hash)
+ hash.keys.each_with_index do |string_key, h|
+ assert_predicate(string_key, :frozen?)
+ assert_same(string_key, @row.headers[h])
+ end
+ end
+
+ def test_to_csv
+ # normal conversion
+ assert_equal("1,2,3,4,\n", @row.to_csv)
+ assert_equal("1,2,3,4,\n", @row.to_s) # alias
+
+ # with options
+ assert_equal( "1|2|3|4|\r\n",
+ @row.to_csv(col_sep: "|", row_sep: "\r\n") )
+ end
+
+ def test_array_delegation
+ assert_not_empty(@row, "Row was empty.")
+
+ assert_equal([@row.headers.size, @row.fields.size].max, @row.size)
+ end
+
+ def test_inspect_shows_header_field_pairs
+ str = @row.inspect
+ @row.each do |header, field|
+ assert_include(str, "#{header.inspect}:#{field.inspect}",
+ "Header field pair not found.")
+ end
+ end
+
+ def test_inspect_encoding_is_ascii_compatible
+ assert_send([Encoding, :compatible?,
+ Encoding.find("US-ASCII"),
+ @row.inspect.encoding],
+ "inspect() was not ASCII compatible.")
+ end
+
+ def test_inspect_shows_symbol_headers_as_bare_attributes
+ str = CSV::Row.new(@row.headers.map { |h| h.to_sym }, @row.fields).inspect
+ @row.each do |header, field|
+ assert_include(str, "#{header}:#{field.inspect}",
+ "Header field pair not found.")
+ end
+ end
+
+ def test_can_be_compared_with_other_classes
+ assert_not_nil(CSV::Row.new([ ], [ ]), "The row was nil")
+ end
+
+ def test_can_be_compared_when_not_a_row
+ r = @row == []
+ assert_equal false, r
+ end
+
+ def test_dig_by_index
+ assert_equal(2, @row.dig(1))
+
+ assert_nil(@row.dig(100))
+ end
+
+ def test_dig_by_header
+ assert_equal(2, @row.dig("B"))
+
+ assert_nil(@row.dig("Missing"))
+ end
+
+ def test_dig_cell
+ row = CSV::Row.new(%w{A}, [["foo", ["bar", ["baz"]]]])
+
+ assert_equal("foo", row.dig(0, 0))
+ assert_equal("bar", row.dig(0, 1, 0))
+
+ assert_equal("foo", row.dig("A", 0))
+ assert_equal("bar", row.dig("A", 1, 0))
+ end
+
+ def test_dig_cell_no_dig
+ row = CSV::Row.new(%w{A}, ["foo"])
+
+ assert_raise(TypeError) do
+ row.dig(0, 0)
+ end
+ assert_raise(TypeError) do
+ row.dig("A", 0)
+ end
+ end
+
+ def test_dup
+ row = CSV::Row.new(["A"], ["foo"])
+ dupped_row = row.dup
+ dupped_row["A"] = "bar"
+ assert_equal(["foo", "bar"],
+ [row["A"], dupped_row["A"]])
+ dupped_row.delete("A")
+ assert_equal(["foo", nil],
+ [row["A"], dupped_row["A"]])
+ end
+end
diff --git a/test/csv/test_table.rb b/test/csv/test_table.rb
new file mode 100644
index 0000000..e8ab740
--- /dev/null
+++ b/test/csv/test_table.rb
@@ -0,0 +1,691 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "helper"
+
+class TestCSVTable < Test::Unit::TestCase
+ extend DifferentOFS
+
+ def setup
+ super
+ @rows = [ CSV::Row.new(%w{A B C}, [1, 2, 3]),
+ CSV::Row.new(%w{A B C}, [4, 5, 6]),
+ CSV::Row.new(%w{A B C}, [7, 8, 9]) ]
+ @table = CSV::Table.new(@rows)
+
+ @header_table = CSV::Table.new(
+ [CSV::Row.new(%w{A B C}, %w{A B C}, true)] + @rows
+ )
+
+ @header_only_table = CSV::Table.new([], headers: %w{A B C})
+ end
+
+ def test_initialize
+ assert_not_nil(@table)
+ assert_instance_of(CSV::Table, @table)
+ end
+
+ def test_modes
+ assert_equal(:col_or_row, @table.mode)
+
+ # non-destructive changes, intended for one shot calls
+ cols = @table.by_col
+ assert_equal(:col_or_row, @table.mode)
+ assert_equal(:col, cols.mode)
+ assert_equal(@table, cols)
+
+ rows = @table.by_row
+ assert_equal(:col_or_row, @table.mode)
+ assert_equal(:row, rows.mode)
+ assert_equal(@table, rows)
+
+ col_or_row = rows.by_col_or_row
+ assert_equal(:row, rows.mode)
+ assert_equal(:col_or_row, col_or_row.mode)
+ assert_equal(@table, col_or_row)
+
+ # destructive mode changing calls
+ assert_equal(@table, @table.by_row!)
+ assert_equal(:row, @table.mode)
+ assert_equal(@table, @table.by_col_or_row!)
+ assert_equal(:col_or_row, @table.mode)
+ end
+
+ def test_headers
+ assert_equal(@rows.first.headers, @table.headers)
+ end
+
+ def test_headers_empty
+ t = CSV::Table.new([])
+ assert_equal Array.new, t.headers
+ end
+
+ def test_headers_only
+ assert_equal(%w[A B C], @header_only_table.headers)
+ end
+
+ def test_headers_modified_by_row
+ table = CSV::Table.new([], headers: ["A", "B"])
+ table << ["a", "b"]
+ table.first << {"C" => "c"}
+ assert_equal(["A", "B", "C"], table.headers)
+ end
+
+ def test_index
+ ##################
+ ### Mixed Mode ###
+ ##################
+ # by row
+ @rows.each_index { |i| assert_equal(@rows[i], @table[i]) }
+ assert_equal(nil, @table[100]) # empty row
+
+ # by row with Range
+ assert_equal([@table[1], @table[2]], @table[1..2])
+
+ # by col
+ @rows.first.headers.each do |header|
+ assert_equal(@rows.map { |row| row[header] }, @table[header])
+ end
+ assert_equal([nil] * @rows.size, @table["Z"]) # empty col
+
+ # by cell, row then col
+ assert_equal(2, @table[0][1])
+ assert_equal(6, @table[1]["C"])
+
+ # by cell, col then row
+ assert_equal(5, @table["B"][1])
+ assert_equal(9, @table["C"][2])
+
+ # with headers (by col)
+ assert_equal(["B", 2, 5, 8], @header_table["B"])
+
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ assert_equal([2, 5, 8], @table[1])
+ assert_equal([2, 5, 8], @table["B"])
+
+ ################
+ ### Row Mode ###
+ ################
+ @table.by_row!
+
+ assert_equal(@rows[1], @table[1])
+ assert_raise(TypeError) { @table["B"] }
+
+ ############################
+ ### One Shot Mode Change ###
+ ############################
+ assert_equal(@rows[1], @table[1])
+ assert_equal([2, 5, 8], @table.by_col[1])
+ assert_equal(@rows[1], @table[1])
+ end
+
+ def test_set_row_or_column
+ ##################
+ ### Mixed Mode ###
+ ##################
+ # set row
+ @table[2] = [10, 11, 12]
+ assert_equal([%w[A B C], [1, 2, 3], [4, 5, 6], [10, 11, 12]], @table.to_a)
+
+ @table[3] = CSV::Row.new(%w[A B C], [13, 14, 15])
+ assert_equal( [%w[A B C], [1, 2, 3], [4, 5, 6], [10, 11, 12], [13, 14, 15]],
+ @table.to_a )
+
+ # set col
+ @table["Type"] = "data"
+ assert_equal( [ %w[A B C Type],
+ [1, 2, 3, "data"],
+ [4, 5, 6, "data"],
+ [10, 11, 12, "data"],
+ [13, 14, 15, "data"] ],
+ @table.to_a )
+
+ @table["Index"] = [1, 2, 3]
+ assert_equal( [ %w[A B C Type Index],
+ [1, 2, 3, "data", 1],
+ [4, 5, 6, "data", 2],
+ [10, 11, 12, "data", 3],
+ [13, 14, 15, "data", nil] ],
+ @table.to_a )
+
+ @table["B"] = [100, 200]
+ assert_equal( [ %w[A B C Type Index],
+ [1, 100, 3, "data", 1],
+ [4, 200, 6, "data", 2],
+ [10, nil, 12, "data", 3],
+ [13, nil, 15, "data", nil] ],
+ @table.to_a )
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+A,B,C,Type,Index
+1,100,3,data,1
+4,200,6,data,2
+10,,12,data,3
+13,,15,data,
+ CSV
+
+ # with headers
+ @header_table["Type"] = "data"
+ assert_equal(%w[Type data data data], @header_table["Type"])
+
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ @table[1] = [2, 5, 11, 14]
+ assert_equal( [ %w[A B C Type Index],
+ [1, 2, 3, "data", 1],
+ [4, 5, 6, "data", 2],
+ [10, 11, 12, "data", 3],
+ [13, 14, 15, "data", nil] ],
+ @table.to_a )
+
+ @table["Extra"] = "new stuff"
+ assert_equal( [ %w[A B C Type Index Extra],
+ [1, 2, 3, "data", 1, "new stuff"],
+ [4, 5, 6, "data", 2, "new stuff"],
+ [10, 11, 12, "data", 3, "new stuff"],
+ [13, 14, 15, "data", nil, "new stuff"] ],
+ @table.to_a )
+
+ ################
+ ### Row Mode ###
+ ################
+ @table.by_row!
+
+ @table[1] = (1..6).to_a
+ assert_equal( [ %w[A B C Type Index Extra],
+ [1, 2, 3, "data", 1, "new stuff"],
+ [1, 2, 3, 4, 5, 6],
+ [10, 11, 12, "data", 3, "new stuff"],
+ [13, 14, 15, "data", nil, "new stuff"] ],
+ @table.to_a )
+
+ assert_raise(TypeError) { @table["Extra"] = nil }
+ end
+
+ def test_set_by_col_with_header_row
+ r = [ CSV::Row.new(%w{X Y Z}, [97, 98, 99], true) ]
+ t = CSV::Table.new(r)
+ t.by_col!
+ t['A'] = [42]
+ assert_equal(['A'], t['A'])
+ end
+
+ def test_each
+ ######################
+ ### Mixed/Row Mode ###
+ ######################
+ i = 0
+ @table.each do |row|
+ assert_equal(@rows[i], row)
+ i += 1
+ end
+
+ # verify that we can chain the call
+ assert_equal(@table, @table.each { })
+
+ # without block
+ enum = @table.each
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@table.size, enum.size)
+
+ i = 0
+ enum.each do |row|
+ assert_equal(@rows[i], row)
+ i += 1
+ end
+
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ headers = @table.headers
+ @table.each do |header, column|
+ assert_equal(headers.shift, header)
+ assert_equal(@table[header], column)
+ end
+
+ # without block
+ enum = @table.each
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@table.headers.size, enum.size)
+
+ headers = @table.headers
+ enum.each do |header, column|
+ assert_equal(headers.shift, header)
+ assert_equal(@table[header], column)
+ end
+
+ ############################
+ ### One Shot Mode Change ###
+ ############################
+ @table.by_col_or_row!
+
+ @table.each { |row| assert_instance_of(CSV::Row, row) }
+ @table.by_col.each { |tuple| assert_instance_of(Array, tuple) }
+ @table.each { |row| assert_instance_of(CSV::Row, row) }
+ end
+
+ def test_each_by_col_duplicated_headers
+ table = CSV.parse(<<-CSV, headers: true)
+a,a,,,b
+1,2,3,4,5
+11,12,13,14,15
+ CSV
+ assert_equal([
+ ["a", ["1", "11"]],
+ ["a", ["2", "12"]],
+ [nil, ["3", "13"]],
+ [nil, ["4", "14"]],
+ ["b", ["5", "15"]],
+ ],
+ table.by_col.each.to_a)
+ end
+
+ def test_each_split
+ yielded_values = []
+ @table.each do |column1, column2, column3|
+ yielded_values << [column1, column2, column3]
+ end
+ assert_equal(@rows.collect(&:to_a),
+ yielded_values)
+ end
+
+ def test_enumerable
+ assert_equal( @rows.values_at(0, 2),
+ @table.select { |row| (row["B"] % 2).zero? } )
+
+ assert_equal(@rows[1], @table.find { |row| row["C"] > 5 })
+ end
+
+ def test_to_a
+ assert_equal([%w[A B C], [1, 2, 3], [4, 5, 6], [7, 8, 9]], @table.to_a)
+
+ # with headers
+ assert_equal( [%w[A B C], [1, 2, 3], [4, 5, 6], [7, 8, 9]],
+ @header_table.to_a )
+ end
+
+ def test_to_csv
+ csv = <<-CSV
+A,B,C
+1,2,3
+4,5,6
+7,8,9
+ CSV
+
+ # normal conversion
+ assert_equal(csv, @table.to_csv)
+ assert_equal(csv, @table.to_s) # alias
+
+ # with options
+ assert_equal( csv.gsub(",", "|").gsub("\n", "\r\n"),
+ @table.to_csv(col_sep: "|", row_sep: "\r\n") )
+ assert_equal( csv.lines.to_a[1..-1].join(''),
+ @table.to_csv(:write_headers => false) )
+
+ # with headers
+ assert_equal(csv, @header_table.to_csv)
+ end
+
+ def test_to_csv_limit_positive
+ assert_equal(<<-CSV, @table.to_csv(limit: 2))
+A,B,C
+1,2,3
+4,5,6
+ CSV
+ end
+
+ def test_to_csv_limit_positive_over
+ assert_equal(<<-CSV, @table.to_csv(limit: 5))
+A,B,C
+1,2,3
+4,5,6
+7,8,9
+ CSV
+ end
+
+ def test_to_csv_limit_zero
+ assert_equal(<<-CSV, @table.to_csv(limit: 0))
+A,B,C
+ CSV
+ end
+
+ def test_to_csv_limit_negative
+ assert_equal(<<-CSV, @table.to_csv(limit: -2))
+A,B,C
+1,2,3
+4,5,6
+ CSV
+ end
+
+ def test_to_csv_limit_negative_over
+ assert_equal(<<-CSV, @table.to_csv(limit: -5))
+A,B,C
+ CSV
+ end
+
+ def test_append
+ # verify that we can chain the call
+ assert_equal(@table, @table << [10, 11, 12])
+
+ # Array append
+ assert_equal(CSV::Row.new(%w[A B C], [10, 11, 12]), @table[-1])
+
+ # Row append
+ assert_equal(@table, @table << CSV::Row.new(%w[A B C], [13, 14, 15]))
+ assert_equal(CSV::Row.new(%w[A B C], [13, 14, 15]), @table[-1])
+ end
+
+ def test_delete_mixed_one
+ ##################
+ ### Mixed Mode ###
+ ##################
+ # delete a row
+ assert_equal(@rows[1], @table.delete(1))
+
+ # delete a col
+ assert_equal(@rows.map { |row| row["A"] }, @table.delete("A"))
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+B,C
+2,3
+8,9
+ CSV
+ end
+
+ def test_delete_mixed_multiple
+ ##################
+ ### Mixed Mode ###
+ ##################
+ # delete row and col
+ second_row = @rows[1]
+ a_col = @rows.map { |row| row["A"] }
+ a_col_without_second_row = a_col[0..0] + a_col[2..-1]
+ assert_equal([
+ second_row,
+ a_col_without_second_row,
+ ],
+ @table.delete(1, "A"))
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+B,C
+2,3
+8,9
+ CSV
+ end
+
+ def test_delete_column
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ assert_equal(@rows.map { |row| row[0] }, @table.delete(0))
+ assert_equal(@rows.map { |row| row["C"] }, @table.delete("C"))
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+B
+2
+5
+8
+ CSV
+ end
+
+ def test_delete_row
+ ################
+ ### Row Mode ###
+ ################
+ @table.by_row!
+
+ assert_equal(@rows[1], @table.delete(1))
+ assert_raise(TypeError) { @table.delete("C") }
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+A,B,C
+1,2,3
+7,8,9
+ CSV
+ end
+
+ def test_delete_with_blank_rows
+ data = "col1,col2\nra1,ra2\n\nrb1,rb2"
+ table = CSV.parse(data, :headers => true)
+ assert_equal(["ra2", nil, "rb2"], table.delete("col2"))
+ end
+
+ def test_delete_if_row
+ ######################
+ ### Mixed/Row Mode ###
+ ######################
+ # verify that we can chain the call
+ assert_equal(@table, @table.delete_if { |row| (row["B"] % 2).zero? })
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+A,B,C
+4,5,6
+ CSV
+ end
+
+ def test_delete_if_row_without_block
+ ######################
+ ### Mixed/Row Mode ###
+ ######################
+ enum = @table.delete_if
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@table.size, enum.size)
+
+ # verify that we can chain the call
+ assert_equal(@table, enum.each { |row| (row["B"] % 2).zero? })
+
+ # verify resulting table
+ assert_equal(<<-CSV, @table.to_csv)
+A,B,C
+4,5,6
+ CSV
+ end
+
+ def test_delete_if_column
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ assert_equal(@table, @table.delete_if { |h, v| h > "A" })
+ assert_equal(<<-CSV, @table.to_csv)
+A
+1
+4
+7
+ CSV
+ end
+
+ def test_delete_if_column_without_block
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ enum = @table.delete_if
+ assert_instance_of(Enumerator, enum)
+ assert_equal(@table.headers.size, enum.size)
+
+ assert_equal(@table, enum.each { |h, v| h > "A" })
+ assert_equal(<<-CSV, @table.to_csv)
+A
+1
+4
+7
+ CSV
+ end
+
+ def test_delete_headers_only
+ ###################
+ ### Column Mode ###
+ ###################
+ @header_only_table.by_col!
+
+ # delete by index
+ assert_equal([], @header_only_table.delete(0))
+ assert_equal(%w[B C], @header_only_table.headers)
+
+ # delete by header
+ assert_equal([], @header_only_table.delete("C"))
+ assert_equal(%w[B], @header_only_table.headers)
+ end
+
+ def test_values_at
+ ##################
+ ### Mixed Mode ###
+ ##################
+ # rows
+ assert_equal(@rows.values_at(0, 2), @table.values_at(0, 2))
+ assert_equal(@rows.values_at(1..2), @table.values_at(1..2))
+
+ # cols
+ assert_equal([[1, 3], [4, 6], [7, 9]], @table.values_at("A", "C"))
+ assert_equal([[2, 3], [5, 6], [8, 9]], @table.values_at("B".."C"))
+
+ ###################
+ ### Column Mode ###
+ ###################
+ @table.by_col!
+
+ assert_equal([[1, 3], [4, 6], [7, 9]], @table.values_at(0, 2))
+ assert_equal([[1, 3], [4, 6], [7, 9]], @table.values_at("A", "C"))
+
+ ################
+ ### Row Mode ###
+ ################
+ @table.by_row!
+
+ assert_equal(@rows.values_at(0, 2), @table.values_at(0, 2))
+ assert_raise(TypeError) { @table.values_at("A", "C") }
+
+ ############################
+ ### One Shot Mode Change ###
+ ############################
+ assert_equal(@rows.values_at(0, 2), @table.values_at(0, 2))
+ assert_equal([[1, 3], [4, 6], [7, 9]], @table.by_col.values_at(0, 2))
+ assert_equal(@rows.values_at(0, 2), @table.values_at(0, 2))
+ end
+
+ def test_array_delegation
+ assert_not_empty(@table, "Table was empty.")
+
+ assert_equal(@rows.size, @table.size)
+ end
+
+ def test_inspect_shows_current_mode
+ str = @table.inspect
+ assert_include(str, "mode:#{@table.mode}", "Mode not shown.")
+
+ @table.by_col!
+ str = @table.inspect
+ assert_include(str, "mode:#{@table.mode}", "Mode not shown.")
+ end
+
+ def test_inspect_encoding_is_ascii_compatible
+ assert_send([Encoding, :compatible?,
+ Encoding.find("US-ASCII"),
+ @table.inspect.encoding],
+ "inspect() was not ASCII compatible." )
+ end
+
+ def test_inspect_with_rows
+ additional_rows = [ CSV::Row.new(%w{A B C}, [101, 102, 103]),
+ CSV::Row.new(%w{A B C}, [104, 105, 106]),
+ CSV::Row.new(%w{A B C}, [107, 108, 109]) ]
+ table = CSV::Table.new(@rows + additional_rows)
+ str_table = table.inspect
+
+ assert_equal(<<-CSV, str_table)
+#<CSV::Table mode:col_or_row row_count:7>
+A,B,C
+1,2,3
+4,5,6
+7,8,9
+101,102,103
+104,105,106
+ CSV
+ end
+
+ def test_dig_mixed
+ # by row
+ assert_equal(@rows[0], @table.dig(0))
+ assert_nil(@table.dig(100)) # empty row
+
+ # by col
+ assert_equal([2, 5, 8], @table.dig("B"))
+ assert_equal([nil] * @rows.size, @table.dig("Z")) # empty col
+
+ # by row then col
+ assert_equal(2, @table.dig(0, 1))
+ assert_equal(6, @table.dig(1, "C"))
+
+ # by col then row
+ assert_equal(5, @table.dig("B", 1))
+ assert_equal(9, @table.dig("C", 2))
+ end
+
+ def test_dig_by_column
+ @table.by_col!
+
+ assert_equal([2, 5, 8], @table.dig(1))
+ assert_equal([2, 5, 8], @table.dig("B"))
+
+ # by col then row
+ assert_equal(5, @table.dig("B", 1))
+ assert_equal(9, @table.dig("C", 2))
+ end
+
+ def test_dig_by_row
+ @table.by_row!
+
+ assert_equal(@rows[1], @table.dig(1))
+ assert_raise(TypeError) { @table.dig("B") }
+
+ # by row then col
+ assert_equal(2, @table.dig(0, 1))
+ assert_equal(6, @table.dig(1, "C"))
+ end
+
+ def test_dig_cell
+ table = CSV::Table.new([CSV::Row.new(["A"], [["foo", ["bar", ["baz"]]]])])
+
+ # by row, col then cell
+ assert_equal("foo", table.dig(0, "A", 0))
+ assert_equal(["baz"], table.dig(0, "A", 1, 1))
+
+ # by col, row then cell
+ assert_equal("foo", table.dig("A", 0, 0))
+ assert_equal(["baz"], table.dig("A", 0, 1, 1))
+ end
+
+ def test_dig_cell_no_dig
+ table = CSV::Table.new([CSV::Row.new(["A"], ["foo"])])
+
+ # by row, col then cell
+ assert_raise(TypeError) do
+ table.dig(0, "A", 0)
+ end
+
+ # by col, row then cell
+ assert_raise(TypeError) do
+ table.dig("A", 0, 0)
+ end
+ end
+end
diff --git a/test/csv/write/test_converters.rb b/test/csv/write/test_converters.rb
new file mode 100644
index 0000000..0e0080b
--- /dev/null
+++ b/test/csv/write/test_converters.rb
@@ -0,0 +1,53 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+module TestCSVWriteConverters
+ def test_one
+ assert_equal(%Q[=a,=b,=c\n],
+ generate_line(["a", "b", "c"],
+ write_converters: ->(value) {"=" + value}))
+ end
+
+ def test_multiple
+ assert_equal(%Q[=a_,=b_,=c_\n],
+ generate_line(["a", "b", "c"],
+ write_converters: [
+ ->(value) {"=" + value},
+ ->(value) {value + "_"},
+ ]))
+ end
+
+ def test_nil_value
+ assert_equal(%Q[a,NaN,29\n],
+ generate_line(["a", nil, 29],
+ write_nil_value: "NaN"))
+ end
+
+ def test_empty_value
+ assert_equal(%Q[a,,29\n],
+ generate_line(["a", "", 29],
+ write_empty_value: nil))
+ end
+end
+
+class TestCSVWriteConvertersGenerateLine < Test::Unit::TestCase
+ include TestCSVWriteConverters
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate_line(row, **kwargs)
+ end
+end
+
+class TestCSVWriteConvertersGenerate < Test::Unit::TestCase
+ include TestCSVWriteConverters
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate(**kwargs) do |csv|
+ csv << row
+ end
+ end
+end
diff --git a/test/csv/write/test_force_quotes.rb b/test/csv/write/test_force_quotes.rb
new file mode 100644
index 0000000..622dcb0
--- /dev/null
+++ b/test/csv/write/test_force_quotes.rb
@@ -0,0 +1,78 @@
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+module TestCSVWriteForceQuotes
+ def test_default
+ assert_equal(%Q[1,2,3#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["1", "2", "3"]))
+ end
+
+ def test_true
+ assert_equal(%Q["1","2","3"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["1", "2", "3"],
+ force_quotes: true))
+ end
+
+ def test_false
+ assert_equal(%Q[1,2,3#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["1", "2", "3"],
+ force_quotes: false))
+ end
+
+ def test_field_name
+ assert_equal(%Q["1",2,"3"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["1", "2", "3"],
+ headers: ["a", "b", "c"],
+ force_quotes: ["a", :c]))
+ end
+
+ def test_field_name_without_headers
+ force_quotes = ["a", "c"]
+ error = assert_raise(ArgumentError) do
+ generate_line(["1", "2", "3"],
+ force_quotes: force_quotes)
+ end
+ assert_equal(":headers is required when you use field name " +
+ "in :force_quotes: " +
+ "#{force_quotes.first.inspect}: #{force_quotes.inspect}",
+ error.message)
+ end
+
+ def test_field_index
+ assert_equal(%Q["1",2,"3"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["1", "2", "3"],
+ force_quotes: [0, 2]))
+ end
+
+ def test_field_unknown
+ force_quotes = [1.1]
+ error = assert_raise(ArgumentError) do
+ generate_line(["1", "2", "3"],
+ force_quotes: force_quotes)
+ end
+ assert_equal(":force_quotes element must be field index or field name: " +
+ "#{force_quotes.first.inspect}: #{force_quotes.inspect}",
+ error.message)
+ end
+end
+
+class TestCSVWriteForceQuotesGenerateLine < Test::Unit::TestCase
+ include TestCSVWriteForceQuotes
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate_line(row, **kwargs)
+ end
+end
+
+class TestCSVWriteForceQuotesGenerate < Test::Unit::TestCase
+ include TestCSVWriteForceQuotes
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate(**kwargs) do |csv|
+ csv << row
+ end
+ end
+end
diff --git a/test/csv/write/test_general.rb b/test/csv/write/test_general.rb
new file mode 100644
index 0000000..677119e
--- /dev/null
+++ b/test/csv/write/test_general.rb
@@ -0,0 +1,246 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+module TestCSVWriteGeneral
+ include Helper
+
+ def test_tab
+ assert_equal("\t#{$INPUT_RECORD_SEPARATOR}",
+ generate_line(["\t"]))
+ end
+
+ def test_quote_character
+ assert_equal(%Q[foo,"""",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q["], "baz"]))
+ end
+
+ def test_quote_character_double
+ assert_equal(%Q[foo,"""""",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q[""], "baz"]))
+ end
+
+ def test_quote
+ assert_equal(%Q[foo,"""bar""",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q["bar"], "baz"]))
+ end
+
+ def test_quote_lf
+ assert_equal(%Q["""\n","""\n"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([%Q["\n], %Q["\n]]))
+ end
+
+ def test_quote_cr
+ assert_equal(%Q["""\r","""\r"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([%Q["\r], %Q["\r]]))
+ end
+
+ def test_quote_last
+ assert_equal(%Q[foo,"bar"""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q[bar"]]))
+ end
+
+ def test_quote_lf_last
+ assert_equal(%Q[foo,"\nbar"""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q[\nbar"]]))
+ end
+
+ def test_quote_lf_value_lf
+ assert_equal(%Q[foo,"""\nbar\n"""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q["\nbar\n"]]))
+ end
+
+ def test_quote_lf_value_lf_nil
+ assert_equal(%Q[foo,"""\nbar\n""",#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", %Q["\nbar\n"], nil]))
+ end
+
+ def test_cr
+ assert_equal(%Q[foo,"\r",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\r", "baz"]))
+ end
+
+ def test_lf
+ assert_equal(%Q[foo,"\n",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\n", "baz"]))
+ end
+
+ def test_cr_lf
+ assert_equal(%Q[foo,"\r\n",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\r\n", "baz"]))
+ end
+
+ def test_cr_dot_lf
+ assert_equal(%Q[foo,"\r.\n",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\r.\n", "baz"]))
+ end
+
+ def test_cr_lf_cr
+ assert_equal(%Q[foo,"\r\n\r",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\r\n\r", "baz"]))
+ end
+
+ def test_cr_lf_lf
+ assert_equal(%Q[foo,"\r\n\n",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "\r\n\n", "baz"]))
+ end
+
+ def test_cr_lf_comma
+ assert_equal(%Q["\r\n,"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["\r\n,"]))
+ end
+
+ def test_cr_lf_comma_nil
+ assert_equal(%Q["\r\n,",#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["\r\n,", nil]))
+ end
+
+ def test_comma
+ assert_equal(%Q[","#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([","]))
+ end
+
+ def test_comma_double
+ assert_equal(%Q[",",","#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([",", ","]))
+ end
+
+ def test_comma_and_value
+ assert_equal(%Q[foo,"foo,bar",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "foo,bar", "baz"]))
+ end
+
+ def test_one_element
+ assert_equal(%Q[foo#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo"]))
+ end
+
+ def test_nil_values_only
+ assert_equal(%Q[,,#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([nil, nil, nil]))
+ end
+
+ def test_nil_double_only
+ assert_equal(%Q[,#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([nil, nil]))
+ end
+
+ def test_nil_values
+ assert_equal(%Q[foo,,,#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", nil, nil, nil]))
+ end
+
+ def test_nil_value_first
+ assert_equal(%Q[,foo,baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([nil, "foo", "baz"]))
+ end
+
+ def test_nil_value_middle
+ assert_equal(%Q[foo,,baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", nil, "baz"]))
+ end
+
+ def test_nil_value_last
+ assert_equal(%Q[foo,baz,#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "baz", nil]))
+ end
+
+ def test_nil_empty
+ assert_equal(%Q[,""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([nil, ""]))
+ end
+
+ def test_nil_cr
+ assert_equal(%Q[,"\r"#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([nil, "\r"]))
+ end
+
+ def test_values
+ assert_equal(%Q[foo,bar#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "bar"]))
+ end
+
+ def test_semi_colon
+ assert_equal(%Q[;#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([";"]))
+ end
+
+ def test_semi_colon_values
+ assert_equal(%Q[;,;#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([";", ";"]))
+ end
+
+ def test_tab_values
+ assert_equal(%Q[\t,\t#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["\t", "\t"]))
+ end
+
+ def test_col_sep
+ assert_equal(%Q[a;b;;c#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["a", "b", nil, "c"],
+ col_sep: ";"))
+ assert_equal(%Q[a\tb\t\tc#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["a", "b", nil, "c"],
+ col_sep: "\t"))
+ end
+
+ def test_row_sep
+ assert_equal(%Q[a,b,,c\r\n],
+ generate_line(["a", "b", nil, "c"],
+ row_sep: "\r\n"))
+ end
+
+ def test_force_quotes
+ assert_equal(%Q["1","b","","already ""quoted"""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([1, "b", nil, %Q{already "quoted"}],
+ force_quotes: true))
+ end
+
+ def test_encoding_utf8
+ assert_equal(%Q[あ,い,う#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["あ" , "い", "う"]))
+ end
+
+ def test_encoding_euc_jp
+ row = ["あ", "い", "う"].collect {|field| field.encode("EUC-JP")}
+ assert_equal(%Q[あ,い,う#{$INPUT_RECORD_SEPARATOR}].encode("EUC-JP"),
+ generate_line(row))
+ end
+
+ def test_encoding_with_default_internal
+ with_default_internal(Encoding::UTF_8) do
+ row = ["あ", "い", "う"].collect {|field| field.encode("EUC-JP")}
+ assert_equal(%Q[あ,い,う#{$INPUT_RECORD_SEPARATOR}].encode("EUC-JP"),
+ generate_line(row, encoding: Encoding::EUC_JP))
+ end
+ end
+
+ def test_with_default_internal
+ with_default_internal(Encoding::UTF_8) do
+ row = ["あ", "い", "う"].collect {|field| field.encode("EUC-JP")}
+ assert_equal(%Q[あ,い,う#{$INPUT_RECORD_SEPARATOR}].encode("EUC-JP"),
+ generate_line(row))
+ end
+ end
+end
+
+class TestCSVWriteGeneralGenerateLine < Test::Unit::TestCase
+ include TestCSVWriteGeneral
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate_line(row, **kwargs)
+ end
+end
+
+class TestCSVWriteGeneralGenerate < Test::Unit::TestCase
+ include TestCSVWriteGeneral
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate(**kwargs) do |csv|
+ csv << row
+ end
+ end
+end
diff --git a/test/csv/write/test_quote_empty.rb b/test/csv/write/test_quote_empty.rb
new file mode 100644
index 0000000..70f73da
--- /dev/null
+++ b/test/csv/write/test_quote_empty.rb
@@ -0,0 +1,70 @@
+# -*- coding: utf-8 -*-
+# frozen_string_literal: false
+
+require_relative "../helper"
+
+module TestCSVWriteQuoteEmpty
+ def test_quote_empty_default
+ assert_equal(%Q["""",""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([%Q["], ""]))
+ end
+
+ def test_quote_empty_false
+ assert_equal(%Q["""",#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([%Q["], ""],
+ quote_empty: false))
+ end
+
+ def test_empty_default
+ assert_equal(%Q[foo,"",baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "", "baz"]))
+ end
+
+ def test_empty_false
+ assert_equal(%Q[foo,,baz#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["foo", "", "baz"],
+ quote_empty: false))
+ end
+
+ def test_empty_only_default
+ assert_equal(%Q[""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([""]))
+ end
+
+ def test_empty_only_false
+ assert_equal(%Q[#{$INPUT_RECORD_SEPARATOR}],
+ generate_line([""],
+ quote_empty: false))
+ end
+
+ def test_empty_double_default
+ assert_equal(%Q["",""#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["", ""]))
+ end
+
+ def test_empty_double_false
+ assert_equal(%Q[,#{$INPUT_RECORD_SEPARATOR}],
+ generate_line(["", ""],
+ quote_empty: false))
+ end
+end
+
+class TestCSVWriteQuoteEmptyGenerateLine < Test::Unit::TestCase
+ include TestCSVWriteQuoteEmpty
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate_line(row, **kwargs)
+ end
+end
+
+class TestCSVWriteQuoteEmptyGenerate < Test::Unit::TestCase
+ include TestCSVWriteQuoteEmpty
+ extend DifferentOFS
+
+ def generate_line(row, **kwargs)
+ CSV.generate(**kwargs) do |csv|
+ csv << row
+ end
+ end
+end
diff --git a/test/lib/with_different_ofs.rb b/test/lib/with_different_ofs.rb
new file mode 100644
index 0000000..8b7cff4
--- /dev/null
+++ b/test/lib/with_different_ofs.rb
@@ -0,0 +1,34 @@
+# frozen_string_literal: true
+
+module DifferentOFS
+ is_output_field_separator_deprecated = false
+ verbose, $VERBOSE = $VERBOSE, true
+ stderr, $stderr = $stderr, StringIO.new
+ begin
+ ofs, $, = $,, "-"
+ is_output_field_separator_deprecated = (not $stderr.string.empty?)
+ ensure
+ $, = ofs
+ $stderr = stderr
+ $VERBOSE = verbose
+ end
+
+ unless is_output_field_separator_deprecated
+ module WithDifferentOFS
+ def setup
+ super
+ @ofs, $, = $,, "-"
+ end
+
+ def teardown
+ $, = @ofs
+ super
+ end
+ end
+
+ def self.extended(klass)
+ super(klass)
+ klass.const_set(:DifferentOFS, Class.new(klass).class_eval {include WithDifferentOFS})
+ end
+ end
+end