Cinchoo ETL — Quick Start: Converting CSV to JSON File

Cinchoo
4 min readJun 16, 2021

--

1. Introduction

ChoETL is an open source ETL (extract, transform and load) framework for .NET. It is a code based library for extracting data from multiple sources, transforming, and loading into your very own data warehouse in .NET environment. You can have data in your data warehouse in no time.

This article talks about converting CSV file to JSON format file using Cinchoo ETL framework. It is very simple to use, with few lines of code, the conversion can be done. You can convert large files as the conversion process is stream based, quite fast and with low memory footprint.

2. Requirement

This framework library is written in C# using .NET 4.5/.NET Core Framework.

3. How to Use

3.1 Sample Data

Let’s begin by looking into a simple example of converting the below CSV file.

Listing 3.1.1 Sample CSV File

firstName,lastName,email,phoneNumber,salary
John,Doe,john@doe.com,0123456789,1200000
Jane,Doe,jane@doe.com,9876543210,800000
James,Bond,james.bond@mi6.co.uk,0612345678,110000

3.2 Install Library

Next, install ChoETL.JSON nuget package. To do this, run the following command in the Package Manager Console.

.NET Standard Framework

Install-Package ChoETL.JSON

.NET Core

Install-Package ChoETL.JSON.NETStandard

Now add ChoETL namespace to the program.

using ChoETL;

3.3 Quick Conversion

Lets use the library to convert the CSV file to JSON formatted file. It is as simple as can be done with few lines of code. No POCO class needed. It is fast, stream based, and consume low memory.

Listing 3.3.1. Quick CSV to JSON file conversion

private static void ConvertAllData()
{
using (var jw = new ChoJSONWriter("sample.json"))
{
using (var cr = new ChoCSVReader("sample.csv").WithFirstLineHeader())
{
jw.Write(cr);
}
}
}

Create an instance of ChoJSONWriter for producing JSON (sample.json) file. Then create an instance of ChoCSVReader object for reading sample.csv file. Finally call 'Write' method on JSONWriter to write the objects to the output file.

Listing 3.3.2. The JSON output file

[
{
"firstName":"John",
"lastName":"Doe",
"email":"john@doe.com",
"phoneNumber":"0123456789",
"salary":"1200000"
},
{
"firstName":"Jane",
"lastName":"Doe",
"email":"jane@doe.com",
"phoneNumber":"9876543210",
"salary":"800000"
},
{
"firstName":"James",
"lastName":"Bond",
"email":"james.bond@mi6.co.uk",
"phoneNumber":"0612345678",
"salary":"110000"
}
]

3.4 Selective Columns Conversion

In some cases, you may want to take control of outputting selective columns in JSON output file. And you want to specify and treat the fields in the native format, so that the JSON output can be produced in cleaner way.

In the demo below, we going to choose the ‘firstName’, ‘lastName’, and ‘salary’ columns and specify the ‘salary’ as double datatype.

Listing 3.4.1. Quick CSV to JSON file conversion

private static void ConvertAllDataWithNativetype()
{
using (var jw = new ChoJSONWriter("sample.json"))
{
using (var cr = new ChoCSVReader("sample.csv")
.WithFirstLineHeader()
.WithField("firstName")
.WithField("lastName")
.WithField("salary", fieldType: typeof(double))
)
{
foreach (var x in cr)
Console.WriteLine(ChoUtility.ToStringEx(x));
//jw.Write(cr);
}
}
}

In above, all the data elements are treated and converted as text. If you want to add type to some column and output them in native format, you can do so by using ‘WithField’ method.

Listing 3.4.2. The JSON output file

[
{
"firstName":"John",
"lastName":"Doe",
"salary":1200000
},
{
"firstName":"Jane",
"lastName":"Doe",
"salary":800000
},
{
"firstName":"James",
"lastName":"Bond",
"salary":110000
}
]

3.3 Using POCO Object

This approach shows you how to define POCO entity class and use them for the conversion process. This approach is more type safe and fine control over the conversion process like doing property validation, consuming callback machanism, etc.

First, create a class with properties along with JSONPath expression to it in a flat out structure. Below, you will find an example of a class which does this.

Listing 3.3.1. Mapping Class

public class Employee
{
[ChoJSONRecordField]
public string FirstName { get; set; }
[ChoJSONRecordField]
public string LastName { get; set; }
[ChoJSONRecordField]
public int Age { get; set; }
[ChoJSONRecordField(JSONPath = "$.address.streetAddress")]
public string StreetAddress { get; set; }
[ChoJSONRecordField(JSONPath = "$.address.city")]
public string City { get; set; }
[ChoJSONRecordField(JSONPath = "$.address.state")]
public string State { get; set; }
[ChoJSONRecordField(JSONPath = "$.address.postalCode")]
public string PortalCode { get; set; }
[ChoJSONRecordField(JSONPath = "$.phoneNumber[?(@.type=='home')].number")]
public string Phone { get; set; }
[ChoJSONRecordField(JSONPath = "$.phoneNumber[?(@.type=='fax')].number")]
public string Fax { get; set; }
}

Then use this class as below to do the conversion of the file.

Listing 3.3.2. Using POCO object to convert JSON to CSV file

private static void UsingPOCO()
{
using (var csv = new ChoCSVWriter("emp.csv").WithFirstLineHeader())
{
using (var json = new ChoJSONReader<Employee>("emp.json"))
{
csv.Write(json);
}
}
}

3.4 Using Projection

This approach shows how to use the LINQ projection method to convert the JSON file to CSV file. As you may know, JSON is a hierarchical, relational, and structured data, and CSV is not. If you have the objects produced from JSON reader in hierarchical format, you must flatten out using LINQ projection and feed them to CSV writer to create the CSV file. The sample below shows how to do it.

Listing 3.4.1. Using Projection to convert JSON to CSV file

private static void UsingProjection()
{
using (var csv = new ChoCSVWriter("emp.csv").WithFirstLineHeader())
{
using (var json = new ChoJSONReader("emp.json"))
{
csv.Write(json.Select(i => new {
FirstName = i.firstName,
LastName = i.lastName,
Age = i.age,
StreetAddress = i.address.streetAddress,
City = i.address.city,
State = i.address.state,
PostalCode = i.address.postalCode,
Phone = i.phoneNumber[0].number,
Fax = i.phoneNumber[1].number
}));
}
}
}

Download the sample attached above, try it.

--

--

Cinchoo
Cinchoo

Written by Cinchoo

ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)

Responses (1)