Table Coordinates

Data tables, with rows containing observations and columns containing (depending on your field) features, dimensions, variables, or series, are arguably the cornerstone of science. Much of the functionality of Toyplot or any other plotting package can be reduced to a process of mapping data series from tables to properties like coordinates and colors. Nevertheless, much tabular information is still best understood in its “native” tabular form, and we believe that even a humble table benefits from good layout and design - which is why Toyplot supports rendering tables as data graphics, treating them as first-class objects instead of specialized markup. This means that you can combine high-quality tables and plots in innovative ways, and save them using any of the formats supported by Toyplot, including HTML, SVG, PDF, and PNG.

To accomplish this, Toyplot provides toyplot.coordinates.Table, which is a specialized coordinate system. Just like Cartesian Coordinates and Numberline Coordinates, tables map domain coordinates to canvas coordinates. Unlike more traditional coordinate systems, tables map integer coordinates that increase from left to right and top to bottom to rectangular regions of the canvas called cells.
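
To make the mapping concrete, here is a minimal sketch of how an integer (row, column) pair could be turned into a rectangle on the canvas. It assumes every row and column gets an equal share of the space and ignores padding, margins, and gaps; cell_rect is a hypothetical helper written for this illustration, not part of Toyplot's API:

```python
def cell_rect(row, column, rows, columns, width, height):
    """Return (left, top, right, bottom) of a cell in canvas units.

    Simplifying assumption: every row and column receives an equal
    share of the space, with no padding, margins, or gaps.
    """
    cell_width = width / columns
    cell_height = height / rows
    left = column * cell_width
    top = row * cell_height
    return (left, top, left + cell_width, top + cell_height)


# Row indices grow downward and column indices grow rightward.
print(cell_rect(0, 0, rows=4, columns=5, width=700, height=400))
print(cell_rect(3, 4, rows=4, columns=5, width=700, height=400))
```

The first call describes the top-left cell and the second the bottom-right cell, which is the opposite vertical direction from a conventional Cartesian plot.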

Be careful not to confuse the table coordinates described in this section with Data Tables, which are purely a data storage mechanism. To make this distinction clear, let’s start by loading some sample data into a data table:

[1]:

import numpy
import toyplot.data
data_table = toyplot.data.temperatures()
data_table = data_table[:10]

Now, we can display the data using a set of table coordinates:

[2]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.cells.column[[0, 1]].width = 150

With surprisingly little effort, this produces a very clean, easy-to-read table. Note that, like regular Cartesian coordinates, the table coordinates fill the available Canvas by default, so you can adjust your canvas width and height to expand or contract the rows and columns in your table. By default, each row and column in the table receives an equal amount of the available space, unless they are individually overridden as we’ve done here. Of course, you’re free to use all of the mechanisms outlined in Canvas Layout to add multiple tables to a canvas.
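
The space-sharing rule can be sketched in a few lines. This is only a simplified model, assuming that overridden columns keep their explicit width and that the remaining space is split equally among the others; column_widths is a hypothetical helper, and the real layout also accounts for the padding around the table, so actual widths will differ:

```python
def column_widths(total_width, columns, overrides=None):
    """Split total_width across columns, honoring explicit overrides.

    overrides maps a column index to a fixed width. Columns without
    an override share the remaining space equally (an assumption made
    for this sketch).
    """
    overrides = overrides or {}
    remaining = total_width - sum(overrides.values())
    free = columns - len(overrides)
    share = remaining / free if free else 0.0
    return [overrides.get(index, share) for index in range(columns)]


# Two columns fixed at 150 units, four more sharing what is left.
print(column_widths(700, 6, overrides={0: 150, 1: 150}))
```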

In this case, the data file contains a series TOBS that we don’t need, so let’s discard it and reorder the TMIN and TMAX columns to put them in a more natural order:

[3]:

data_table = data_table[["STATION", "STATION_NAME", "DATE", "TMIN", "TMAX"]]

[4]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.cells.column[[0, 1]].width = 170

As it happens, all of the columns in our data table contain string values. Note that the labels and columns in the graphic are all left-justified, the default for string data. Let’s see what happens when we convert TMIN and TMAX to integers:

[5]:

data_table["TMIN"] = data_table["TMIN"].astype("int")
data_table["TMAX"] = data_table["TMAX"].astype("int")

[6]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.cells.column[[0, 1]].width = 170

After converting TMIN and TMAX to integers, they are right-justified within the table, so their digits all align, making it easy to judge magnitudes. It turns out that the data in these columns are actually integers representing tenths-of-a-degree Celsius, so let’s convert them to floating-point Celsius degrees and see what happens:

[7]:

data_table["TMIN"] = data_table["TMIN"] * 0.1
data_table["TMAX"] = data_table["TMAX"] * 0.1

[8]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.cells.column[[0, 1]].width = 170

Now, all of the decimal points are properly aligned within each column, even for values without a decimal point!
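
If you are curious how decimal alignment can work in principle, the following plain-text sketch splits each formatted number at its decimal point and pads the two halves separately. It is an illustration for monospaced text only, not Toyplot's rendering code; align_on_decimal and its default format string are assumptions made for this example:

```python
def align_on_decimal(values, fmt="{:.6g}"):
    """Format numbers and pad them so their decimal points line up."""
    parts = []
    for value in values:
        # Split "12.25" into ("12", "25"); "7" becomes ("7", "").
        whole, _, fraction = fmt.format(value).partition(".")
        parts.append((whole, fraction))
    left = max(len(whole) for whole, _ in parts)
    right = max(len(fraction) for _, fraction in parts)
    lines = []
    for whole, fraction in parts:
        # Values without a fraction get a space where the point would be.
        point = "." if fraction else " "
        lines.append(whole.rjust(left) + point + fraction.ljust(right))
    return lines


for line in align_on_decimal([3.3, -12.25, 7.0]):
    print(repr(line))
```

Note that the value without a fractional part still reserves room for the decimal point, so its digits stay in the same column as the whole-number digits of its neighbors.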

If you ever want to change the way data in a table is formatted, you can do so by assigning a different format object. For example, you could switch to a fixed number of digits to the right of the decimal point:

[9]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table)
table.cells.column[[0, 1]].width = 170
table.cells.column[3:5].format = toyplot.format.FloatFormatter("{:.1f}")

Next, let’s title our figure. As with the other coordinate systems, tables have a label property that can be set at construction time:

[10]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170

You also have complete control over the gridlines that separate the cells in a table:

[11]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"
table.cells.grid.hlines[1,...] = "double"

For a table with \(M\) rows and \(N\) columns, the table.cells.grid.hlines matrix controls the appearance of \((M+1) \times N\) horizontal lines, and table.cells.grid.vlines controls \(M \times (N+1)\) vertical lines. Use “single” for single lines, “double” for double lines, or any value that evaluates to False to hide the lines.
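
The shapes are easy to check with a small stand-in built from nested lists. This sketch assumes that horizontal line k sits directly above row k, which is why index 1 addresses the line just below the header row; make_grid is a hypothetical helper, not a Toyplot function:

```python
def make_grid(rows, columns):
    """Build hlines and vlines as nested lists with every line hidden."""
    hlines = [[None] * columns for _ in range(rows + 1)]
    vlines = [[None] * (columns + 1) for _ in range(rows)]
    return hlines, vlines


# One header row plus ten data rows, five columns.
hlines, vlines = make_grid(rows=11, columns=5)

# Counterpart of hlines[...] = "single".
for index in range(len(hlines)):
    hlines[index] = ["single"] * len(hlines[index])

# Counterpart of hlines[1, ...] = "double": the line under row 0.
hlines[1] = ["double"] * len(hlines[1])

print(len(hlines), len(hlines[0]))  # 12 5
print(len(vlines), len(vlines[0]))  # 11 6
```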

Suppose you wanted to highlight the observations in the dataset with the highest high temperature and the lowest low temperature. You could do so by changing the style of the given rows:

[12]:

low_index = numpy.argsort(data_table["TMIN"])[0]
high_index = numpy.argsort(data_table["TMAX"])[-1]

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.cells.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.cells.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}

Wait a second … those colored rows are both off-by-one! The actual minimum and maximum values are in the rows immediately following the colored rows. What happened? Note that the table has an “extra” row for the column headers, so row zero in the data is actually row one in the table. In other words, the data rows have “one-based” indices instead of the “zero-based” indices that all good programmers should expect. We could fix the problem by offsetting the indices we calculated from the raw data, but that would be error-prone and annoying. The offset would also change if we ever changed the number of extra rows before the body of the table (which we’ll see an example of in a moment).
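
To see why manual offsets are fragile, here is the arithmetic spelled out. The readings are made-up values and body_to_table_row is a hypothetical helper; the only assumption is that the header rows come first, so a body row's position in the whole table is its own index plus the number of header rows:

```python
def body_to_table_row(body_row, trows=1):
    """Convert a zero-based body row index to a whole-table row index."""
    return trows + body_row


tmin = [-3.3, -5.6, -1.1, -7.2, -2.8]  # hypothetical readings
low_index = min(range(len(tmin)), key=tmin.__getitem__)

print(low_index)                              # 3, the index in the data
print(body_to_table_row(low_index))           # 4, with one header row
print(body_to_table_row(low_index, trows=2))  # 5, with two header rows
```

Every change to the number of header rows changes the answer, which is exactly the bookkeeping we would rather not do by hand.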

What we really need is a way to refer to the “top” rows and the “body” rows in the table separately, using zero-based indices for each. Fortunately, Toyplot does just that - we can use special accessor attributes to target our changes to the top or the body, using coordinates that won’t be affected by changes to other parts of the table:

[13]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}

Now the correct rows have been highlighted. Let’s add another row of headers to verify that the highlighting isn’t affected:

[14]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, trows=2, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.top.grid.hlines[...] = "single"
table.top.grid.vlines[...] = "single"
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}

Sure enough, the correct rows are still highlighted, the top section of the table contains a second row, and we made the extra row obvious with some grid lines. Note that by accessing the grid via the “top” accessor, we were able to easily alter just the grid lines for the top cells. Let’s take things a step further and provide some additional labels in the new top row:

[15]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, trows=2, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}
table.top.grid.hlines[...] = "single"
table.top.grid.vlines[...] = "single"
table.top.cell[0, 0:2].merge().data = "Location"
table.top.cell[0, 3:6].merge().data = u"Temperature \u00b0C"

Note the use of merge() to merge together ranges of cells, along with the data attribute to assign new cell contents.

Also, you may have noticed that the merged cells took on the attributes (alignment, style, etc.) of the cells that were merged, which is why the “Location” label is left-justified, while the “Temperature” label is centered. Let’s center-justify the Location label, make both labels a little more prominent, and lose the gridlines:

[16]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, trows=2, label="Temperature Readings")
table.cells.column[[0, 1]].width = 170
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.top.cell[0, 0:2].merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.top.cell[0, 3:6].merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

Finally, let’s finish off our figure by plotting the minimum and maximum temperatures vertically along the right-hand side of the table. This will provide an intuitive guide to trends in the data. To do this, we’ll begin by adding two extra columns to the table using the rcolumns parameter, duplicating the TMIN and TMAX data in the new cells, and readjusting column widths to accommodate the new columns:

[17]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, trows=2, rcolumns=2, label="Temperature Readings")
table.body.column[[0, 1]].width = 150
table.body.column[2].width = 70
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.top.cell[0, 0:2].merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.cells.cell[0, 3:6].merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

table.right.column[0].data = data_table["TMIN"]
table.right.column[1].data = data_table["TMAX"]

# This is to produce a clean figure, we won't need it in the next step.
table.right.column[0:2].format = toyplot.format.FloatFormatter()

Now, we can replace the new cells with a line plot that uses their data. First, we embed a set of Cartesian coordinates in the two columns, then we use cell_plot() to create a line plot based on the data in the underlying cells:

[18]:

canvas = toyplot.Canvas(width=700, height=400)
table = canvas.table(data_table, trows=2, rcolumns=2, label="Temperature Readings")
table.body.column[[0, 1]].width = 150
table.body.column[2].width = 70
table.body.row[low_index].lstyle = {"font-weight":"bold", "fill":"blue"}
table.body.row[high_index].lstyle = {"font-weight":"bold", "fill":"red"}
merged = table.top.cell[0, 0:2].merge()
merged.data = "Location"
merged.align = "center"
merged.lstyle = {"font-size":"14px"}
merged = table.cells.cell[0, 3:6].merge()
merged.data = u"Temperature \u00b0C"
merged.lstyle = {"font-size":"14px"}

table.right.column[0].data = data_table["TMIN"]
table.right.column[1].data = data_table["TMAX"]
table.right.column[0:2].width = 40

axes = table.right.column[0:2].cartesian()
mark = axes.cell_plot(color=["blue", "red"], marker="o", style={"stroke-width":1.0})

Because the axes were embedded across two columns, the plot contains two series. When embedding coordinates in table cells, the axes, ticks, and labels are hidden by default to avoid visual clutter. The combination of tabular and plotted data demonstrates how Toyplot makes it easy to display data in innovative ways that other libraries can’t.

Regions

In the above examples we accessed four distinct regions of the table:

cells - a special region containing every cell in the table.
body - to access the cells that make up the bulk of the table.
top - to access “header” cells above the body region, typically used for column labels.
right - to access additional cells to the right of the body region that we used to embed a plot.

As you might imagine, tables actually contain nine distinct regions: body, top, right, bottom, left, top.left, top.right, bottom.right, and bottom.left. Below, we demonstrate using explicit row and column counts to create an empty table so we can highlight all nine regions:

[19]:

canvas, table = toyplot.table(rows=4, columns=4, trows=2, brows=2, lcolumns=2, rcolumns=2, width=400, height=400)

table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.top.left.cells.style = {"fill":"red", "opacity":0.5}
table.top.cells.style = {"fill":"orange", "opacity":0.5}
table.top.right.cells.style = {"fill":"yellow", "opacity":0.5}
table.right.cells.style = {"fill":"greenyellow", "opacity":0.5}
table.bottom.right.cells.style = {"fill":"green", "opacity":0.5}
table.bottom.cells.style = {"fill":"aqua", "opacity":0.5}
table.bottom.left.cells.style = {"fill":"blue", "opacity":0.5}
table.left.cells.style = {"fill":"purple", "opacity":0.5}
table.body.cells.style = {"fill":"white"}

Note how the trows, brows, lcolumns, and rcolumns parameters control the number of cells in the top, bottom, left, and right regions respectively. Typically, you would use these regions to display header, label, and summary information for the data in the body region, although you are free to use any region for any purpose you like.
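
The relationship between the four counts and the nine regions can be written down as a small lookup. This sketch assumes that rows and columns count only the body cells, as in the call above, and that indices are relative to the whole table; region_of is a hypothetical helper for illustration, not a Toyplot function:

```python
def region_of(row, column, rows, columns,
              trows=0, brows=0, lcolumns=0, rcolumns=0):
    """Name the region containing a cell, using whole-table indices.

    rows and columns count the body only (an assumption of this sketch).
    """
    total_rows = trows + rows + brows
    total_columns = lcolumns + columns + rcolumns
    if not (0 <= row < total_rows and 0 <= column < total_columns):
        raise IndexError("cell is outside the table")

    if row < trows:
        vertical = "top"
    elif row >= trows + rows:
        vertical = "bottom"
    else:
        vertical = ""

    if column < lcolumns:
        horizontal = "left"
    elif column >= lcolumns + columns:
        horizontal = "right"
    else:
        horizontal = ""

    name = ".".join(part for part in (vertical, horizontal) if part)
    return name or "body"


sizes = dict(rows=4, columns=4, trows=2, brows=2, lcolumns=2, rcolumns=2)
for cell in [(0, 0), (0, 3), (3, 3), (4, 6), (7, 7)]:
    print(cell, region_of(*cell, **sizes))
```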

Indexing

When accessing table cells, rows, columns and other objects, you can use any of the indexing techniques provided by numpy, including individual indices, slicing, and advanced indexing. In the following we will call out examples of each. To begin, you can use individual indices for one-at-a-time indexing of rows, columns, and cells:

[20]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.cells.column[1].style = {"fill":"red", "opacity":0.5}
table.cells.row[2].style = {"fill":"green", "opacity":0.5}
table.cells.cell[4, 3].style = {"fill":"blue", "opacity":0.5}

Note above that overlapping assignments overwrite cell parameters. As is normal in Python, negative indices are interpreted relative to the end of a range:

[21]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.cells.column[-1].style = {"fill":"red", "opacity":0.5}
table.cells.row[-2].style = {"fill":"green", "opacity":0.5}
table.cells.cell[-4, -3].style = {"fill":"blue", "opacity":0.5}

Of course, you can use slices to access ranges of rows, columns, and cells:

[22]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.cells.column[0:2].style = {"fill":"red", "opacity":0.5}
table.cells.row[0:2].style = {"fill":"green", "opacity":0.5}
table.cells.cell[3:5, 3:5].style = {"fill":"blue", "opacity":0.5}

And you can use slicing with steps to access every other column, every third row, etc.:

[23]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.cells.column[0:6:2].style = {"fill":"red", "opacity":0.5}
table.cells.row[0:8:3].style = {"fill":"green", "opacity":0.5}
table.cells.cell[5:9:2, 6:9:2].style = {"fill":"blue", "opacity":0.5}

Plus, you can use advanced indexing, such as an explicit list of indices:

[24]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)
table.cells.grid.hlines[...] = "single"
table.cells.grid.vlines[...] = "single"

table.cells.column[[1,2,4]].style = {"fill":"red", "opacity":0.5}
table.cells.row[[1,7,8]].style = {"fill":"green", "opacity":0.5}
table.cells.cell[[3,4,6], 6:9].style = {"fill":"blue", "opacity":0.5}

Of course, you can use any indexing technique to set any property of a cell, and you can use all of these techniques when working with gridlines and other properties of the table:

[25]:

canvas, table = toyplot.table(rows=9, columns=9, width=300, height=300)

table.cells.column[3].data = "A"
table.cells.column[3].style = {"fill":"red", "opacity":0.5}
table.cells.column[2:5].width = 35

table.cells.row[2:4].data = "B"
table.cells.row[2:4].style = {"fill":"green", "opacity":0.5}

table.cells.cell[1:5, 1:6].lstyle = {"fill":"white"}

table.cells.grid.hlines[..., 3] = "single"
table.cells.grid.hlines[2:5, ...] = "single"

table.cells.grid.vlines[2:4, ...] = "single"
table.cells.grid.vlines[..., 3:5] = "single"
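
If you are ever unsure which rows or columns an index expression selects, you can try it on a plain numpy array first. The sketch below uses an array of row numbers as a stand-in for the rows of a 9 x 9 table; it demonstrates numpy's own indexing rules and does not touch Toyplot at all:

```python
import numpy

# Stand-in for the row (or column) numbers of a 9 x 9 table.
rows = numpy.arange(9)

print(rows[2])          # individual index
print(rows[-2])         # negative index, counted from the end
print(rows[0:2])        # slice
print(rows[0:8:3])      # slice with a step
print(rows[[1, 7, 8]])  # advanced indexing with an explicit list

# A two-dimensional stand-in for the cells themselves.
cells = numpy.arange(81).reshape(9, 9)
block = cells[[3, 4, 6], 6:9]  # list of rows combined with a column slice
print(block.shape)
```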

Grouping Data

It’s common to group together subsets of data within a table. Toyplot currently provides three mechanisms that you may find useful for grouping. First, as we’ve already seen, you can use horizontal or vertical grid lines to separate groups:

[26]:

numpy.random.seed(1234)
data = toyplot.data.Table(numpy.random.normal(size=(10, 4)))

[27]:

canvas, table = toyplot.table(data, width=500, height=400)
table.body.grid.hlines[[3, 7],...] = "single"

Second, you could change the background colors of cells to highlight groups:

[28]:

canvas, table = toyplot.table(data, width=500, height=400)
table.body.row[3:7].style = {"fill":"#eee", "stroke":"none"}

Finally, you can use the gaps property to insert whitespace between groups:

[29]:

canvas, table = toyplot.table(data, width=500, height=400)
table.body.gaps.rows[[2,6]] = "0.5cm"
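
The three examples above use different index values, [3, 7] for lines, 3:7 for styles, and [2, 6] for gaps, yet they appear to mark out the same three groups of rows. The sketch below checks that reading under two assumptions inferred from the examples rather than taken from the API documentation: horizontal line k is drawn directly above row k, and gap k is the space directly below row k. Both helpers are hypothetical:

```python
def groups_from_hlines(row_count, hlines):
    """Split rows into groups at each horizontal line index.

    Assumes line k is drawn directly above row k.
    """
    bounds = [0] + sorted(hlines) + [row_count]
    return [list(range(start, stop))
            for start, stop in zip(bounds, bounds[1:])]


def groups_from_gaps(row_count, gaps):
    """Split rows into groups after each gap index.

    Assumes gap k is the space directly below row k.
    """
    return groups_from_hlines(row_count, [gap + 1 for gap in gaps])


print(groups_from_hlines(10, [3, 7]))
print(groups_from_gaps(10, [2, 6]))
print(list(range(10))[3:7])  # the rows restyled in the second example
```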

Because a Matrix Visualization is actually a table, you can use gaps to adjust its appearance, too:

[30]:

canvas, table = toyplot.matrix(data, width=400, height=700)
table.body.gaps.columns[...] = 10
table.body.gaps.rows[...] = 10

Plot Embedding

Nearly any type of plot can be embedded in a table, but the fact that they’re embedded introduces additional considerations. As an example, assume we have the following table:

[31]:

data = toyplot.data.Table(numpy.random.normal(loc=4, size=(10, 2)))

canvas, table = toyplot.table(data, width=250, height=400)
table.cells.cell[...].format = toyplot.format.FloatFormatter(format="{:.1f}")

As we saw earlier, we can create a set of axes that are “contained” within a range of cells, and add a plot that uses the underlying data:

[32]:

canvas, table = toyplot.table(data, width=250, height=400)
table.cells.cell[...].format = toyplot.format.FloatFormatter(format="{:.1f}")

axes = table.body.column[1].cartesian()
axes.cell_plot();

And as you might imagine, other plot types are supported:

[33]:

canvas, table = toyplot.table(data, width=250, height=400)
table.cells.cell[...].format = toyplot.format.FloatFormatter(format="{:.1f}")

axes = table.body.column[1].cartesian()
axes.cell_bars();

You can also plot data along rows instead of columns:

[34]:

data = toyplot.data.Table(numpy.random.normal(loc=4, size=(2, 10)))

canvas, table = toyplot.table(data, width=500, height=250)
table.cells.cell[...].format = toyplot.format.FloatFormatter(format="{:.1f}")

axes = table.body.row[1].cartesian()
axes.cell_bars(series="rows");
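
The difference between the two orientations comes down to how a block of cell values is cut into series. Here is a minimal sketch of that idea using nested lists; series_from_cells is a hypothetical helper, and treating "columns" as the default is an assumption inferred from the earlier example in which axes spanning two columns produced two series:

```python
def series_from_cells(cells, series="columns"):
    """Split a rectangular block of cell values into plot series.

    cells is a list of rows; the return value is a list of series.
    """
    if series == "columns":
        # Transpose so that each column becomes one series.
        return [list(column) for column in zip(*cells)]
    if series == "rows":
        return [list(row) for row in cells]
    raise ValueError("series must be 'columns' or 'rows'")


cells = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]
print(series_from_cells(cells))                 # two series of three values
print(series_from_cells(cells, series="rows"))  # three series of two values
```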

Most importantly, you will need to decide whether you want to display data stored within the table, or data from elsewhere. For example, we can also embed a plot that contains its own, arbitrary data:

[35]:

plot_data = numpy.random.normal(loc=4, size=(25, 3))

canvas, table = toyplot.table(data, width=500, height=250)
table.cells.cell[...].format = toyplot.format.FloatFormatter(format="{:.1f}")

axes = table.body.row[1].cartesian()
axes.bars(plot_data);

Note that when you do this there is no longer any relationship between the plot and the underlying table - you are simply displaying the plot, constrained to fit in the table cells you selected when you created the axes.