toyplot.data module

Classes and functions for working with raw data.

class toyplot.data.Table(data=None, index=False)

Bases: object

Encapsulates an ordered, heterogeneous collection of labelled data series.

Parameters:
  • data (data series, optional) –

    You may initialize a toyplot.data.Table with any of the following:

    • None (the default) - creates an empty table (a table without any columns).

    • toyplot.data.Table - creates a copy of the given table.

    • collections.OrderedDict - creates a column for each key-value pair in the input, in the same order. Each value must be implicitly convertible to a numpy masked array, and every value must contain the same number of items.

    • object returned when loading a .npz file with numpy.load() - creates a column for each key-value pair in the given file, in the same order. Each array in the input file must contain the same number of items.

    • dict / collections.abc.Mapping - creates a column for each key-value pair in the input, sorted by key in lexicographical order. Each value must be implicitly convertible to a numpy masked array, and every value must contain the same number of items.

    • list / collections.abc.Sequence - creates a column for each (key, value) tuple in the input sequence, in the same order. Each value must be implicitly convertible to a numpy masked array, and every value must contain the same number of items.

    • numpy.ndarray - creates a column for each column in a numpy matrix (2D array). The order of the columns is maintained, and each column is assigned a unique name.

    • pandas.DataFrame - creates a column for each column in a Pandas data frame. The order of the columns is maintained.

  • index (bool or string, optional) –

    Controls whether to convert a Pandas data frame index to columns in the resulting table. Use index=False (the default) to leave the data frame index out of the table. Use index=True to include the index in the table, using default column names (hierarchical indices will be stored in the table using multiple columns). Use index="format string" to include the index and control how the index column names are generated. The given format string can use positional {} / {0} or keyword {index} arguments to incorporate a zero-based index id into the column names.
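The format-string rule for index column names can be illustrated with plain Python string formatting. This is a minimal sketch of the naming scheme described above, not toyplot's internal code; the helper name index_column_names and the example format strings are hypothetical.

```python
def index_column_names(fmt, levels):
    """Generate one column name per index level from a format string.

    The zero-based level id is passed both positionally and as the
    keyword ``index``, so "{}", "{0}", and "{index}" all work.
    """
    return [fmt.format(i, index=i) for i in range(levels)]


# A two-level (hierarchical) index yields two column names.
positional = index_column_names("level{}", 2)
explicit = index_column_names("level{0}", 2)
keyword = index_column_names("idx_{index}", 2)
```

All three placeholder styles receive the same zero-based id, so they differ only in how the format string is spelled.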

items()

Return column names and columns, in column order.

Returns:

items – Sequence of (name, column) tuples.

Return type:

list

keys()

Return the table column names, in column order.

Returns:

keys

Return type:

sequence of str column names.

matrix()

Convert the table to a matrix (2D numpy array).

The data type of the returned array is chosen based on the types of the columns within the table. Tables containing a homogeneous set of column types will return an array of the same type. If the table contains one or more string columns, the result will be an array of strings.

Returns:

matrix – The returned array will have two dimensions.

Return type:

numpy.ma.MaskedArray
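The type rule described for matrix() can be sketched with numpy alone. The helper stack_columns below is hypothetical and only approximates the documented behaviour: it casts every column to strings when any column holds strings, and otherwise relies on numpy's ordinary numeric promotion, which is an assumption for mixed numeric columns.

```python
import numpy as np


def stack_columns(columns):
    """Stack equal-length 1D columns into a 2D masked array (illustrative only)."""
    arrays = [np.ma.asarray(column) for column in columns]
    if any(a.dtype.kind in "US" for a in arrays):
        # At least one string column: represent every value as a string.
        arrays = [a.astype(str) for a in arrays]
    # Each input becomes one column of the result.
    return np.ma.column_stack(arrays)


numeric = stack_columns([[1, 2, 3], [4.0, 5.0, 6.0]])
mixed = stack_columns([[1, 2], ["a", "b"]])
```

Here numeric is a 3 x 2 floating-point array, while mixed is a 2 x 2 array of strings because one of its columns contains strings.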

metadata(column)

Return metadata for one of the table’s columns.

Parameters:

column (string) – The name of an existing column.

Returns:

metadata

Return type:

dict containing key-value pairs.

property shape

The table shape (number of rows and columns).

Returns:

shape – (number of rows, number of columns) tuple.

Return type:

tuple

values()

Return the table columns, in column order.

Returns:

values

Return type:

sequence of numpy.ndarray columns.

toyplot.data.cars()

Return sample automobile model data.

Returns:

table – Table containing descriptions of multiple makes and models of automobile.

Return type:

toyplot.data.Table

toyplot.data.communities()

Return sample community detection data.

Returns:

  • edges (numpy.ndarray) – An \(E \times 2\) matrix containing source and target vertex ids for \(E\) edges.

  • truth (numpy.ndarray) – A \(V \times 2\) matrix containing a vertex id and ground-truth community id for \(V\) vertices.

  • assigned (numpy.ndarray) – A \(V \times 2\) matrix containing a vertex id and alternate community id for \(V\) vertices.
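To show how arrays of these shapes fit together, the sketch below counts the edges whose endpoints share a ground-truth community. The tiny edges and truth arrays are made-up stand-ins with the documented layouts, not the data returned by toyplot.data.communities().

```python
import numpy as np

# Hypothetical stand-ins: an E x 2 edge list and a V x 2 (vertex id, community id) table.
edges = np.array([[0, 1], [1, 2], [2, 3]])
truth = np.array([[0, 0], [1, 0], [2, 1], [3, 1]])

# Map each vertex id to its community id.
community = dict(zip(truth[:, 0].tolist(), truth[:, 1].tolist()))

# An edge is intra-community when both endpoints map to the same community.
same = [community[source] == community[target] for source, target in edges.tolist()]
intra_edges = sum(same)
```

The same lookup works with the assigned array, which has the identical (vertex id, community id) layout.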

toyplot.data.commute()

Return sample OBD-II commuting data.

Returns:

table – Table containing a stream of OBD-II data collected from an automobile during a morning commute.

Return type:

toyplot.data.Table

toyplot.data.contiguous(a)

Split an array into a collection of contiguous ranges.
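The description does not spell out the return value, so the following is only one plausible reading of "contiguous ranges": runs of equal adjacent values, each reported as a half-open (begin, end) range plus the repeated value. The helper contiguous_runs is hypothetical and is not claimed to match the output format of toyplot.data.contiguous().

```python
import itertools


def contiguous_runs(values):
    """Return (begin, end, value) for each run of equal adjacent values.

    ``end`` is exclusive, so ``values[begin:end]`` is the whole run.
    """
    runs = []
    start = 0
    for value, group in itertools.groupby(values):
        length = sum(1 for _ in group)
        runs.append((start, start + length, value))
        start += length
    return runs


runs = contiguous_runs([1, 1, 2, 2, 2, 1])
```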

toyplot.data.deliveries()

Return sample delivery data.

Returns:

table – Table containing sample delivery data.

Return type:

toyplot.data.Table

toyplot.data.minimax(items)

Compute the minimum and maximum of an arbitrary collection of scalar- or array-like items.

The items parameter must be an iterable containing any combination of None, scalars, numpy arrays, or numpy masked arrays. None, NaN, masked values, and empty arrays are all handled correctly. Returns (None, None) if the inputs don’t contain any usable values.

Returns:

  • min – Minimum value of the input arrays, or None.

  • max – Maximum value of the input arrays, or None.
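The semantics described above can be approximated with numpy as follows. This is a sketch under the assumption that "handled correctly" means None, NaN, masked values, and empty arrays are skipped; minimax_sketch is a hypothetical stand-in, not toyplot's implementation.

```python
import numpy as np


def minimax_sketch(items):
    """Return (min, max) over all usable values, or (None, None) if there are none."""
    lo = hi = None
    for item in items:
        if item is None:
            continue
        arr = np.ma.array(item, dtype=float)
        data = np.ravel(np.ma.getdata(arr))
        mask = np.ravel(np.ma.getmaskarray(arr))
        # Keep only values that are neither masked nor NaN.
        values = data[~mask & ~np.isnan(data)]
        if values.size == 0:
            continue
        item_min, item_max = float(values.min()), float(values.max())
        lo = item_min if lo is None else min(lo, item_min)
        hi = item_max if hi is None else max(hi, item_max)
    return lo, hi


result = minimax_sketch([
    None,
    3,
    np.array([1.0, np.nan, 7.0]),
    np.ma.array([-5, 10], mask=[True, False]),
    [],
])
empty = minimax_sketch([None, [], np.array([np.nan])])
```

In this sketch the masked -5 and the NaN never influence the result, and an input with no usable values yields (None, None).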

toyplot.data.read_csv(fobj, convert=False)

Load a CSV (delimited text) file.

Parameters:
  • fobj (file-like object or string, required) – The file to read. Use a string filepath, an open file, or a file-like object.

  • convert (boolean, optional) – By default, the columns in a table will contain strings. If True, convert column types to integers and floats where possible.

Returns:

table

Return type:

toyplot.data.Table

Notes

read_csv() is a simple tool for use in demos and tutorials. For more full-featured delimited text parsing, you should consider the csv module included in the Python standard library, or functionality provided by numpy or Pandas.
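As a rough illustration of the standard-library route mentioned in the note, the sketch below reads delimited text into named columns with the csv module. The helper read_columns is hypothetical; treating the first row as a header and converting a whole column to int, then float, only when every value parses are assumptions, not documented read_csv() behaviour.

```python
import csv
import io


def read_columns(fobj, convert=False):
    """Read CSV rows from an open file-like object into a dict of column lists."""
    rows = list(csv.reader(fobj))
    header, body = rows[0], rows[1:]  # assumes the first row holds column names
    columns = {name: [row[i] for row in body] for i, name in enumerate(header)}
    if convert:
        for name, values in list(columns.items()):
            for cast in (int, float):
                try:
                    columns[name] = [cast(value) for value in values]
                    break  # the whole column converted cleanly
                except ValueError:
                    continue  # try the next type, or leave the strings as-is
    return columns


text = "name,count,weight\na,1,2.5\nb,2,3\n"
raw = read_columns(io.StringIO(text))
converted = read_columns(io.StringIO(text), convert=True)
```

Without conversion every column holds strings; with convert=True the count column becomes integers, weight becomes floats, and name stays as strings.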

toyplot.data.temperatures()

Return sample temperature data.

Returns:

table – Table containing temperature data collected by NOAA.

Return type:

toyplot.data.Table