Note: phiên bản Tiếng Việt của bài này ở link dưới.
https://duongnt.com/read-csv-helper-vie
Reading CSV files is one of those tasks that seem way easier at the first glance than it actually is. It is not hard to write some code to parse data from a CSV file. Making sure that code is type-safe, fast, and flexible is a much bigger challenge. Fortunately, thanks to the CsvHelper library, we don’t have to reinvent the wheel.
You can download all sample code in this article from the link below. https://github.com/duongntbk/CsvHelperDemo
Installing CsvHelper
You can install CsvHelper by running this command.
dotnet add package CsvHelper --version 27.1.1
Read CSV files with CsvHelper
This is our sample CSV file, it has three text columns and one number column.
FirstName,LastName,Age,IsActive
John,Doe,30,Yes
Jane,Doe,31,No
Duong,Nguyen,31,Yes
We will map each row into an object of type Person
.
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int? Age { get; set; }
public string IsActive { get; set; }
}
The code to read our CSV file is below.
var fileName = @"<path to our CSV file>";
var configuration = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Encoding = Encoding.UTF8, // Our file uses UTF-8 encoding.
Delimiter = "," // The delimiter is a comma.
};
using (var fs = File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
{
using (var textReader = new StreamReader(fs, Encoding.UTF8))
using (var csv = new CsvReader(textReader, configuration))
{
var data = csv.GetRecords<Person>();
foreach (var person in data)
{
// Do something with values in each row
}
}
}
The type of data
is IEnumerable<Person>
. CsvHelper will automatically map each column to the property with the same name. For example, the value in FirstName
column will be mapped into Person.FirstName
. We can then iterate data
and access the values in each row.
Change reader configuration
The CsvConfiguration
class has many configurables to control how we read a CSV file. Below are some of the more important ones.
HasHeaderRecord
: set this totrue
to treat the first row as header. The default value istrue
. Some might wonder how we can map columns in CSV files with properties in model class if we don’t have a header. You can find the answer in a later section.Quote
: the quote character in our CSV file. The default value is"
.IgnoreBlankLines
: set this totrue
to skip mapping blank lines. The default value istrue
.Delimiter
: the delimiter to separate fields. The default value is,
(The format is called COMMA-separated values after all).DetectDelimiter
: set this totrue
to automatically detect the delimiter without using the value inDelimiter
. The default value isfalse
.Encoding
: the encoding in use.
Manually map columns in a CSV file with properties in the model class
Map columns by name
Naturally, when working in Japan, one needs to handle CSV files with Japanese header. One such file can look like this.
姓,名,年齢,アクティブ
Doe,John,30,Yes
Doe,Jane,31,No
Nguyen,Duong,31,Yes
As mentioned in a previous section, in the simplest case, CsvHelper automatically maps each column in our CSV file with a property with the same name. But naming our properties in Japanese is madness, so what can we do? The ClassMap
comes to the rescue. Essentially, we subclass ClassMap
and use it to tell CsvHelper which column should be mapped to which property.
public class PersonMapByName : ClassMap<Person>
{
public PersonMapByName()
{
Map(p => p.FirstName).Name("名");
Map(p => p.LastName).Name("姓");
Map(p => p.Age).Name("年齢");
Map(p => p.IsActive).Name("アクティブ");
}
}
Then all we need to do is register this mapping class with CsvHelper before calling csv.GetRecords
.
csv.Context.RegisterClassMap<PersonMapByName>();
var data = csv.GetRecords<Person>();
It is also possible to skip some columns in the CSV file. For example, the mapping below maps only 姓
and 名
, while skipping 年齢
and アクティブ
.
public class PersonMapByName : ClassMap<Person>
{
public PersonMapByName()
{
Map(p => p.FirstName).Name("名");
Map(p => p.LastName).Name("姓");
}
}
Map columns by index
The solution above works fine if columns in our CSV file all have unique names. But if there are multiple columns with the same name then only the first column is mapped. For example, with the file below, the value of IsActive
will be Yes/No
instead of True/False
.
FirstName,LastName,Age,IsActive,IsActive
John,Doe,30,Yes,True
Jane,Doe,31,No,False
Duong,Nguyen,31,Yes,True
If we want to use True/False
here, we need to use mapping by index. Mapping by index is very similar to mapping by name, the only difference is what we chain to the Map
function. Let’s consider the following example.
public class PersonMapByIndex : ClassMap<Person>
{
public PersonMapByName()
{
Map(p => p.FirstName).Index(0); // The index starts from 0
Map(p => p.LastName).Index(1);
Map(p => p.Age).Index(2);
Map(p => p.IsActive).Index(4); // We map IsActive to the second IsActive column (index == 4)
}
}
Moreover, we can mix mapping by name and mapping by index. We just need to make sure that all properties are mapped one way or another. This is because when a mapping class exists, implicit auto mapping is disabled.
public class PersonMapByIndex : ClassMap<Person>
{
public PersonMapByName()
{
Map(p => p.FirstName).Name("FirstName"); // Map by name
Map(p => p.LastName); // Explicit auto mapping
Map(p => p.Age).Index(2); // Map by index
Map(p => p.IsActive).Index(4); // Map by index
}
}
Note: if our CSV file doesn’t even have headers then mapping by index is the only choice. In this case, we also need to set the HasHeaderRecord
flag in config to false
.
Convert data in CSV files from one type to another
Maybe you have realized that the Age
property in our Person
class is an Integer
, but a CSV file only contains text data. CsvHelper can convert data from string
into standard .NET types (Boolean, Int32, Int64, Enum,…). You can find the list of converters that CsvHelper supports in this link.
What if the type we want to map is a non-standard type. Or it is a standard type, but the string value in our CSV file is non-standard. In that case, we can write our own converter class. Let’s take another look at our first sample CSV file.
FirstName,LastName,Age,IsActive
John,Doe,30,Yes
Jane,Doe,31,No
Duong,Nguyen,31,Yes
And let’s map it to the model below. Notice that IsActive
is a bool
this time.
public class PersonV2
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int? Age { get; set; }
public bool IsActive { get; set; }
}
A custom type converter
Even though bool
is a standard .NET type, the default converter can only handle True/False
value, while our CSV file has Yes/No
. Here, we need to subclass the DefaultTypeConverter
, then override the ConvertFromString
method.
public class CustomBooleanConverter : DefaultTypeConverter
{
public override object ConvertFromString(string text, IReaderRow row, MemberMapData memberMapData)
{
return text.Equals("Yes", StringComparison.OrdinalIgnoreCase);
}
}
Although it is outside the scope of this article, if you want to write data to CSV files, then you’ll have to override the ConvertToString
method. The implementation below will convert a boolean value back into Yes/No
. And non-boolean values will throw an exception.
public override string ConvertToString(object value, IWriterRow row, MemberMapData memberMapData)
{
if (value is bool boolVal)
{
return boolVal ? "Yes" : "No";
}
throw new ArgumentException("Value must be a boolean");
}
Register the custom type converter with CsvHelper
As we can see, our converter will convert the string Yes
(case-insensitive) to true
; while other strings will be converted to false
. The next step is to tell our mapping class to use the converter by calling the TypeConverter
method.
public class PersonMapWithConverter : ClassMap<PersonV2>
{
public PersonMapWithConverter()
{
Map(p => p.FirstName);
Map(p => p.LastName);
Map(p => p.Age);
Map(p => p.IsActive).TypeConverter<CustomBooleanConverter>();
}
}
Notice that we can also call the TypeConverter
method after mapping by name or by index.
Map(p => p.IsActive).Name("IsActive").TypeConverter<CustomBooleanConverter>();
Or
Map(p => p.IsActive).Index(3).TypeConverter<CustomBooleanConverter>();
Then we can register the mapping class with CsvHelper and read the sample file like usual.
csv.Context.RegisterClassMap<PersonMapWithConverter>();
var data = csv.GetRecords<PersonV2>();
Read CSV files asynchronously
CsvHelper also supports reading CSV files asynchronously. Instead of the GetRecords
method, we will call the GetRecordsAsync
method; and the return value will be a IAsyncEnumerable<T>
.
var fileName = @"<path to our CSV file>";
var configuration = new CsvConfiguration(CultureInfo.InvariantCulture)
{
Encoding = Encoding.UTF8, // Our file uses UTF-8 encoding
Delimiter = "," // The delimiter is a comma
};
using (var fs = File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read))
{
using (var textReader = new StreamReader(fs, Encoding.UTF8))
using (var csv = new CsvReader(textReader, configuration))
{
var data = csv.GetRecordsAsync<Person>();
await foreach (var person in data) // Iterate through the collection asynchronously
{
// Do something with values in each row
}
}
}
With IAsyncEnumerable
and an await foreach
, our loop will yield to the control thread while waiting to read the next element in the collection.
Conclusion
I have used CsvHelper in many real-world projects, and it can handle files with up to a few dozen thousands records without any problem. If your files go into the million records territory then you might run into some memory issues. But until then, CsvHelper with its simple syntax can be a very good choice.
Awesome
Good tutorial.
Thanks.
Thank you. Please check out the other articles as well.