////
Included in:

- user-manual: tables
////

=== Escaping the Cell Separator

The parser scans for the cell separator to partition cells _before_ it processes the cell text.
So even if you try to hide the cell separator using an inline passthrough, the parser will see it.
If the cell content contains the cell separator, you must escape that character.
There are three ways to escape it:

* Prefix the character with a leading backslash (i.e., `\|`), which will be removed from the output.
* Use the `\{vbar}` attribute reference as a substitute.
* Change the cell separator used by the table.

Unless you do one of these things, the cell separator will be interpreted as a cell boundary.

Consider the following example, which escapes the cell separator using a leading backslash:

[source]
----
[cols=2*]
|====
|The default separator in PSV tables is the \| character.
|The \| character is often referred to as a "`pipe`".
|====
----

This table will render as follows:

.Result: Converted PSV table that contains pipe characters
[cols=2*]
|====
|The default separator in PSV tables is the \| character.
|The \| character is often referred to as a "`pipe`".
|====

Notice that the pipe character appears without the leading backslash (i.e., unescaped) in the rendered result.

An alternative is to use the `\{vbar}` attribute reference as a substitute.
This approach produces the same result as the previous example.

[source]
----
[cols=2*]
|====
|The default separator in PSV tables is the {vbar} character.
|The {vbar} character is often referred to as a "`pipe`".
|====
----

Escaping each cell separator character that appears in the content of a cell can be tedious.
There are also times when you can't or don't want to modify the cell content (perhaps because it is being included from another file).
To address these cases, AsciiDoc allows you to override the cell separator.

The cell separator is controlled using the `separator` attribute on the table block.
You'll want to select a character that will never be used for content.
A good candidate is the broken bar, or `¦`.

Here's the previous example rewritten using a custom separator.

[source]
----
[cols=2*,separator=¦]
|====
¦The default separator in PSV tables is the | character.
¦The | character is often referred to as a "`pipe`".
|====
----

Notice that it's no longer necessary to escape the pipe character in the content of the table cells.
You can safely use the original cell separator in the cell content and not worry about it being interpreted as the boundary of a cell.
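
A custom separator is most helpful when the pipe character appears often in the cell content.
The following sketch (the commands and descriptions are invented for illustration) shows a table of shell pipelines in which none of the pipe characters need to be escaped:

[source]
----
[%header,cols="1,2",separator=¦]
|===
¦Command ¦Purpose
¦`ps aux | grep ruby` ¦List the running Ruby processes
¦`sort log.txt | uniq` ¦Print the unique lines of a log file
|===
----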

[#delimiter-separated-values]
=== Delimiter-Separated Values

Tables can also be populated from data formatted as delimiter-separated values (i.e., data tables).
In contrast with the PSV format, in which the delimiter is placed in front of each cell value, the delimiter in a delimiter-separated format (CSV, TSV, DSV) is placed between the cell values (called a _separator_) and does not accept a cell formatting spec.
Each line of data is assumed to represent a single row, though you'll learn that's not a strict rule.
How the table data gets interpreted is controlled by the `format` and `separator` attributes on the table.

.What the delimiter?
****
Aren't comma-separated values a subset of https://en.wikipedia.org/wiki/Delimiter-separated_values[delimiter-separated values]?
It really depends on who you consult.

The term "`delimiter-separated values`" in this text refers to the family of data formats that use a delimiter, including comma-separated values (CSV), tab-separated values (TSV) and delimited data (DSV), all of which are supported in AsciiDoc tables.
CSV is the data format used most often.

"`Comma-separated values`" is really a misleading term since CSV can use delimiters other than `,` as the field separator (which, in this context, separates cells).
What we're really talking about is how the data is interpreted.

CSV and TSV both use a delimiter and an optional enclosing character, loosely based on https://tools.ietf.org/html/rfc4180[RFC 4180].
DSV (i.e., delimited data) only uses a delimiter, which can be escaped using a backslash; an enclosing character is not recognized.
These parsing rules are described in detail in <<data-table-formats>>.
****

Let's consider an example of using comma-separated values (CSV) to populate an AsciiDoc table with data.
To instruct the processor to read the data as CSV, set the value of the `format` attribute on the table to `csv`.
When the `format` attribute is set to `csv`, the default data separator is a comma (`,`), as seen in the table below.

[source]
----
include::ex-table-data.adoc[tag=csv]
----

.Result: Rendered CSV table
[width=90%]
include::ex-table-data.adoc[tag=csv]

This feature is particularly useful when you want to populate a table in your manuscript from data stored in a separate file.
You can do so using the <<user-manual#include-directive,include directive>> between the table delimiters, as shown here:

[source]
----
[%header,format=csv]
|===
\include::tracks.csv[]
|===
----

If your data is separated by tabs instead of commas, set the `format` to `tsv` (tab-separated values) instead.

Now let's consider an example of using delimited data (DSV) to populate an AsciiDoc table with data.
To instruct the processor to read the data as DSV, set the value of the `format` attribute on the table to `dsv`.
When the `format` attribute is set to `dsv`, the default data separator is a colon (`:`), as seen in the table below.

[source]
----
include::ex-table-data.adoc[tag=dsv]
----

.Result: Rendered DSV table
[width=90%]
include::ex-table-data.adoc[tag=dsv]

==== Data Table Formats

The CSV and TSV data formats are parsed differently from the DSV data format.
The following two sections outline those differences.

===== CSV and TSV

Table data in either CSV or TSV format is parsed according to the following rules, loosely based on https://tools.ietf.org/html/rfc4180[RFC 4180]:

* The default delimiter for CSV is a comma (`,`) while the default delimiter for TSV is a tab character.
* Blank lines are skipped (unless enclosed in a quoted value).
* Whitespace surrounding each value is stripped.
* Values can be enclosed in double quotes (`"`).
** A quoted value may contain zero or more separator or newline characters.
** A newline begins a new row unless the newline is enclosed in double quotes.
** A quoted value may include the double quote character if escaped using another double quote (`""`).
** Newlines in quoted values are retained (as of 1.5.7).
* If rows do not have the same number of cells ("`ragged`" tables), cells are shuffled to fully fill the rows.
** This behavior differs from that of Excel, which pads short rows with empty cells.
** Extra cells at the end of the last row get dropped.
** As a rule of thumb, data for a single row should be on the same line.
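
As a minimal sketch of the quoting rules above (the names and values are invented for illustration), the following CSV table should produce a single row with two cells.
The first value contains the separator and the second contains escaped double quotes.

[source]
----
[format=csv]
|===
"Smith, Jane","She said ""hello"" and left"
|===
----

The first cell reads `Smith, Jane` and the second reads `She said "hello" and left`.
The enclosing double quotes are not part of either value.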

===== DSV

Table data in DSV format is parsed according to the following rules:

* The default delimiter for DSV is a colon (`:`).
* Blank lines are skipped.
* Whitespace surrounding each value is stripped.
* The delimiter character can be included in the value if escaped using a single backslash (`\:`).
* If rows do not have the same number of cells ("`ragged`" tables), cells are shuffled to fully fill the rows.
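
For instance, in the following sketch (the data is invented for illustration), the escaped colons remain part of the first value, so each row yields two cells rather than three.

[source]
----
[format=dsv]
|===
09\:30:Standup
14\:00:Design review
|===
----

The first cell of each row contains the time, including its colon, without the backslash.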

==== Custom Delimiters

Each data format has a default separator associated with it (csv = comma, tsv = tab, dsv = colon), but the separator can be changed to any character (or even a string of characters) by setting the `separator` attribute on the table.

Here's an example of a DSV table that uses a custom separator character (i.e., delimiter):

.A DSV table with a custom separator
[source]
----
[format=dsv,separator=;]
|===
a;b;c
d;e;f
|===
----

TIP: To make a TSV table, you can set the `format` attribute to `csv` and the separator to `\t`, though the `tsv` format is preferred.

The separator is independent of the processing rules for the format.
If you set `format=dsv` and `separator=,`, the data will be processed using the DSV rules, even though the data looks like CSV.
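
The reverse combination works the same way.
In this sketch (the data is invented for illustration), the values are separated by semicolons but parsed using the CSV rules, so a quoted value can contain the separator.

[source]
----
[format=csv,separator=;]
|===
"Doe; Jane";Engineer
|===
----

Under the CSV rules, this row has two cells, `Doe; Jane` and `Engineer`.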

==== Shorthand Notation for Data Tables

Asciidoctor provides shorthand notation for specifying the data format of a table.
The first position of the table block delimiter (i.e., `|===`) can be replaced by a built-in delimiter to set the table format (e.g., `,===` for CSV).

To make a CSV table, you can use `,===` as the table block delimiter:

[source]
----
include::ex-table-data.adoc[tag=s-csv]
----

.Result: Rendered CSV table using shorthand syntax
[width=90%]
include::ex-table-data.adoc[tag=s-csv]

To make a DSV table, you can use `:===` as the table block delimiter:

[source]
----
include::ex-table-data.adoc[tag=s-dsv]
----

.Result: Rendered DSV table using shorthand syntax
[width=90%]
include::ex-table-data.adoc[tag=s-dsv]

When using either the CSV or DSV shorthand, you do not need to set the `format` attribute as it's implied.
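
Other table attributes can still be set in the usual way.
The following sketch (the data is invented for illustration) combines the CSV shorthand with the `header` option, so the first line of data becomes the header row:

[source]
----
[%header]
,===
Name,Role
Ada,Engineer
Grace,Admiral
,===
----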

To make a TSV table, you can set the `format` attribute to `tsv` instead of having to set the `format` to `csv` and the separator to `\t`.
In this case, you can use either `|===` or `,===` as the table block delimiter.
There is no special delimited block notation for a TSV table.

==== Formatting Cells in a Data Table

The delimited formats do not provide a way to express formatting of individual table cells.
Instead, you can apply cell formatting to all cells in a given column using the `cols` spec on the table:

[source]
----
[format=csv,cols="1h,1a"]
|===
Sky,image::sky.jpg[]
Forest,image::forest.jpg[]
|===
----

Data tables do not support cells that span multiple rows or columns, since that information can only be expressed at the cell level.
You are advised to use the PSV format if you need that functionality.
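
As a rough sketch of the PSV alternative (the cell text is invented for illustration), a span is declared in the cell spec that precedes the cell separator:

[source]
----
[cols=3*]
|===
2+|Spans two columns |Single column
|One |Two |Three
|===
----

Here, the `2+` in front of the first cell separator makes that cell span two columns.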