CSV and JSON are both standard text-based ways to represent data, but their strengths are entirely different. CSV is suited to a table of rows and columns (two dimensions), while JSON is suited to structured data with nesting and types. This article lays out, accurately, how the two differ in mechanism, the points where conversion between them goes wrong, character encodings and delimiters, and when to use each.
1. What CSV is — rows and columns, delimiters, RFC 4180
CSV (Comma-Separated Values) is a two-dimensional table format in which each line is one record and each column is separated by a comma (,). In most cases the first line is used as the column names (the header). The format is loosely standardized by RFC 4180, but be aware that many dialects exist in practice.
id,name,age
1,Alice,30
2,Bob,25
Quoting and escaping
When a value itself contains the delimiter, a newline, or a double quote, plain comma-splitting breaks. RFC 4180 lets you wrap the whole value in double quotes (") to signal "this is a single value." In addition, a double quote written inside a quoted value is escaped by doubling it ("").
- Value containing a comma:
Tokyo, Japanis wrapped as"Tokyo, Japan". - Value containing a newline: a newline inside a cell is fine too, as long as the whole value is wrapped in double quotes, so it is treated as one value.
- Value containing a double quote:
She said "hi"is written as"She said ""hi""".
id,note
1,"Tokyo, Japan"
2,"She said ""hi"""
3,"line1
line2"
split(",") yourself breaks easily on such values.
2. What JSON is — structured, nestable, typed
JSON (JavaScript Object Notation) is a text format that represents structured data by nesting objects (sets of key-value pairs, { }) and arrays (ordered lists, [ ]). Unlike CSV, a major feature is that values carry types.
- Distinguishes types: a string (
"30") and a number (30), as well as booleans (true/false) andnull, are clearly separated. - Can nest: you can put objects and arrays inside an object, representing hierarchical structure directly.
- Can hold arrays: a single key can hold multiple values (a list).
{
"id": 1,
"name": "Alice",
"age": 30,
"active": true,
"address": { "city": "Tokyo", "zip": "100-0001" },
"tags": ["admin", "staff"]
}
In this way JSON naturally expresses the hierarchy, arrays, and types that do not fit into a single CSV cell. This expressiveness is why it is widely used for web API responses and application configuration files.
3. Comparison table — readability, nesting, types, size, use
Here are their characteristics at a glance. The point is not which is superior, but which fits the shape of your data.
| Aspect | CSV | JSON |
|---|---|---|
| Data structure | Two-dimensional table (rows and columns) | Nestable hierarchy and arrays |
| Types | In principle all strings (untyped) | Distinguishes string, number, boolean, null |
| Readability | Easy to read for table data | Readable as nesting deepens / can be verbose |
| Size | Small (key names not repeated) | Larger (each element carries a key name) |
| Spreadsheet fit | Opens directly (Excel, etc.) | Hard to expand into a table as-is |
| Main use | Table data, bulk export/import | Web APIs, config files, structured data |
For the same table data, CSV tends to be smaller in size. On the other hand, when you need nesting or type distinctions, a CSV of only rows and columns cannot express it, and JSON becomes necessary.
4. How conversion works — headers as keys / flattening nesting
CSV → JSON is straightforward
The basic approach to converting CSV to JSON is to make the header row the keys of each object, turn each line from the second onward into one object, and gather them into an array.
id,name,age
1,Alice,30
↓
[
{ "id": "1", "name": "Alice", "age": "30" }
]
Watch out for types here. CSV values are inherently all strings, so by default even numbers are read as strings, as with "30" above. If you want numbers or booleans, you must explicitly convert types during conversion, declaring "this column is numeric" (automatic inference can also cause wrong conversions).
JSON → CSV is the hard direction (flattening)
The reverse is not straightforward. JSON can hold nesting and arrays, but CSV is a two-dimensional table, so you must decide how to project the hierarchy into columns (flattening). Two representative issues are:
- Use the union of keys as the header: objects may not all have the same keys. Take the union of keys that appear across all objects as the columns, and leave missing values as empty cells.
- Flattening nesting and arrays: you decide whether to flatten keys with dots, such as
address.city, and how to fit an array into a single cell (store it as a JSON string, add a column per element, etc.). All of these are custom conventions; there is no standard way to do them.
{ "id": 1, "address": { "city": "Tokyo" }, "tags": ["a","b"] }
↓ (one example of flattening)
id,address.city,tags
1,Tokyo,"[""a"",""b""]"
5. Character encoding, BOM, delimiters (TSV / semicolons)
Because CSV is loosely specified, it is a format prone to trouble with character encoding and delimiters. JSON, by contrast, assumes UTF-8 by standard, so this kind of problem is far less likely.
- Character encoding: it is safest to use UTF-8. In Japanese environments, Shift_JIS CSV still lingers, and opening it as UTF-8 produces mojibake (garbled text). Always align the encoding when handing files off.
- BOM: some spreadsheet software does not recognize a file as UTF-8 unless it starts with a BOM (Byte Order Mark) for UTF-8, and shows garbled text otherwise. Conversely, some processing wrongly reads the BOM as part of the data, so be deliberate about whether to include one.
- Delimiters: characters other than commas are used too. Tab-separated is TSV, and depending on locale or settings, a semicolon (
;) delimiter is used (notably in regions that use a comma as the decimal mark). Mistaking the delimiter leaves everything crammed into one column. - Line endings: RFC 4180 uses
CRLFas the baseline, but many CSV files use onlyLF, and a mix can break parsing.
6. When to use which — spreadsheets / config / APIs
Finally, here is how to choose in practice, by use case.
- Spreadsheets and data exchange → CSV: it opens directly in Excel or Google Sheets and is ideal for bulk export/import of flat table data. It also suits handing off large volumes with many rows.
- Configuration files → JSON: it handles nested settings and typed values (numbers, booleans), and many languages and tools can read and write it by default (note that comments are not allowed).
- Web APIs → JSON: because it carries hierarchy, arrays, and types directly, it is the standard format for API requests and responses.
When in doubt, use the test "does it fit in a single table?" If it does, CSV; if you need nesting, arrays, or type distinctions, JSON. Reasoning this way, you will rarely be far off.
Free Tool Try it with the CSV ⇄ JSON converter Convert between CSV and JSON in your browser. It supports converting with the header as keys and copying the result.Frequently Asked Questions (FAQ)
Should I use CSV or JSON?
Choose based on the shape of your data. CSV suits simple tabular data with one record per row, exchanges with spreadsheet software, and bulk processing of huge datasets. JSON suits data where you need nesting and arrays, where you must distinguish types such as string, number, boolean, and null, and exchanges over web APIs. Remember: a flat table is CSV, structured data is JSON, and you will rarely go wrong.
What is quoting in CSV?
When a value contains the delimiter (a comma), a newline, or a double quote itself, you wrap the whole value in double quotes to signal that it is a single value. Under RFC 4180, a double quote written inside a quoted value is escaped by doubling it (""). For example, if the value is She said "hi", you write it in CSV as "She said ""hi""".
Can nested data be represented in CSV?
CSV is essentially a two-dimensional table, so it cannot directly represent nested structures like JSON. In practice you work around this by flattening nested keys into dot-separated column names such as user.address.city, or by storing an array as a JSON string inside a cell. These are custom conventions, however, so you must define your flattening rules and verify that no information is lost on a round trip through the conversion.