Overview

This is an index of the main functionality available in Tabeline. Each constructor and verb has a link to more detailed documentation. The expression functions do not yet have detailed documentation, but should be self-explanatory.

Creation

These are the ways to create a DataFrame from something that is not already a DataFrame.

Export

These are the various things that a DataFrame can be converted into.

Verbs

Each verb is a method of DataFrame.

Column reorganization

Row removal

  • filter: Keep rows for which predicate is true
  • slice0: Keep rows given by 0-index
  • slice1: Keep rows given by 1-index
  • distinct: Drop rows with duplicate values under given columns
  • unique: Drop rows with duplicate values under all columns
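
The difference between the two slice verbs and between distinct and unique can be sketched in plain Python. This is only an illustration of the semantics described above, not Tabeline code: rows are modeled as a list of dicts, the helper functions are hypothetical stand-ins that borrow the verb names, and keeping the first row of each duplicate set is an assumption.

```python
def slice0(rows, indexes):
    """Keep the rows at the given 0-based positions."""
    return [rows[i] for i in indexes]


def slice1(rows, indexes):
    """Keep the rows at the given 1-based positions."""
    return [rows[i - 1] for i in indexes]


def distinct(rows, *columns):
    """Keep one row per combination of values under `columns`.

    Assumption: the first row seen for each combination is the one kept.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def unique(rows):
    """Drop rows that duplicate an earlier row under every column."""
    return distinct(rows, *rows[0].keys()) if rows else []


rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "red"},
    {"id": 2, "color": "red"},
    {"id": 3, "color": "blue"},
]

first_and_last_0 = slice0(rows, [0, 3])  # positions counted from 0
first_and_last_1 = slice1(rows, [1, 4])  # the same rows, counted from 1
one_per_color = distinct(rows, "color")  # duplicates judged on "color" only
no_exact_duplicates = unique(rows)       # duplicates judged on every column
```

Here distinct collapses the three red rows into one, while unique only removes the exact repeat of the row with id 2.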

Row reordering

  • sort: Sort data frame according to given columns
  • cluster: Bring rows together with same values under given columns
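
To make the contrast between sort and cluster concrete, here is a plain-Python sketch rather than Tabeline code. The `cluster` helper is hypothetical, and it assumes that groups keep the order in which their key first appears, which the description above does not state.

```python
def cluster(rows, key):
    """Bring rows with equal keys together without ordering the keys.

    Assumption: groups appear in order of first appearance, and rows
    keep their relative order within each group.
    """
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    return [row for group in groups.values() for row in group]


rows = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]

clustered = cluster(rows, key=lambda row: row[0])
sorted_rows = sorted(rows, key=lambda row: row[0])
```

Under that assumption the clustered result starts with the "b" rows because "b" appears first, whereas sorting puts the "a" rows first.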

Column mutation

  • mutate: Create or update columns according to given expressions
  • transmute: Mutate while dropping existing columns

Grouping

  • group_by: Create a group level containing given columns
  • ungroup: Drop the last group level

Summarizing

  • summarize: Reduce each group to a single row according to given expressions

Reshaping

  • spread: Reshape from long format to wide format
  • gather: Reshape from wide format to long format
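
The two reshaping directions can be illustrated with a plain-Python sketch over a list of dicts. These helpers are hypothetical stand-ins, not Tabeline's methods, and their parameter names (`key`, `value`, `columns`) are assumptions chosen for the illustration.

```python
def gather(rows, key, value, columns):
    """Wide to long: emit one row per (input row, gathered column)."""
    out = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in columns}
        for column in columns:
            out.append({**kept, key: column, value: row[column]})
    return out


def spread(rows, key, value):
    """Long to wide: make one column per distinct value under `key`."""
    wide = {}
    for row in rows:
        # The remaining columns identify which wide row this belongs to.
        ident = tuple((k, v) for k, v in row.items() if k not in (key, value))
        wide.setdefault(ident, dict(ident))[row[key]] = row[value]
    return list(wide.values())


wide_rows = [
    {"id": 1, "x": 10, "y": 20},
    {"id": 2, "x": 30, "y": 40},
]

long_rows = gather(wide_rows, "name", "val", ["x", "y"])
round_trip = spread(long_rows, "name", "val")
```

In this sketch, gathering and then spreading on the same key and value columns recovers the original wide rows.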

Joining

  • inner_join: Merge data frames, dropping unmatched
  • outer_join: Merge data frames, adding nulls for unmatched
  • left_join: Merge data frames, keeping every row of the left data frame and adding nulls where it is unmatched

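The three joins differ only in what happens to unmatched rows. The following plain-Python sketch is not Tabeline code: `join` is a hypothetical helper that joins on a single key column, assumes non-empty inputs with uniform columns, and uses `None` for null. The row order of the output is also an assumption.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dict rows on one key column."""
    left_columns = [c for c in left[0] if c != key]
    right_columns = [c for c in right[0] if c != key]

    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    out = []
    matched_keys = set()
    for left_row in left:
        matches = right_by_key.get(left_row[key], [])
        if matches:
            matched_keys.add(left_row[key])
            out.extend({**left_row, **right_row} for right_row in matches)
        elif how in ("left", "outer"):
            # Unmatched left row: fill the right-hand columns with nulls.
            out.append({**left_row, **dict.fromkeys(right_columns)})

    if how == "outer":
        for right_row in right:
            if right_row[key] not in matched_keys:
                # Unmatched right row: fill the left-hand columns with nulls.
                out.append({**dict.fromkeys(left_columns), **right_row})
    return out


left = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}]
right = [{"id": 2, "b": "p"}, {"id": 3, "b": "q"}]

inner = join(left, right, "id", how="inner")
left_only = join(left, right, "id", how="left")
outer = join(left, right, "id", how="outer")
```

Only the row with id 2 survives the inner join; the left join also keeps id 1 with a null under b, and the outer join additionally keeps id 3 with a null under a.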
Concatenating

Functions

These are the operators and functions available in the string expressions.

Operators

If either operand is null, the result is null.

  • x + y: x plus y or, if strings, x concatenated to y
  • x - y: x minus y
  • x * y: x times y
  • x / y: x divided by y
  • x % y: x mod y
  • x ** y: x to the power of y
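
The null rule for operators can be sketched in plain Python, with `None` standing in for null. This is an illustration of the rule stated above, not how Tabeline evaluates expressions; `null_propagating` is a hypothetical helper.

```python
import operator


def null_propagating(op):
    """Wrap a binary operator so that a null operand yields null."""
    def wrapped(x, y):
        if x is None or y is None:
            return None
        return op(x, y)
    return wrapped


add = null_propagating(operator.add)
mod = null_propagating(operator.mod)
power = null_propagating(operator.pow)

total = add(1, 2)               # numbers are added
joined = add("ab", "cd")        # strings are concatenated
missing = add(None, 2)          # a null operand makes the result null
remainder = mod(7, 3)
cube = power(2, 3)
```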

Numeric to numeric broadcast

The functions in this section perform mathematical operations on numbers. If these functions receive only scalar inputs, they return a scalar. If they receive any array inputs, any scalars are interpreted as constant vectors and an array is returned. For each null element, the result is null.

  • abs(x): Absolute value of x
  • sqrt(x): Square root of x
  • log(x): Natural logarithm of x
  • log2(x): Base-2 logarithm of x
  • log10(x): Base-10 logarithm of x
  • exp(x): Euler's number e to the power of x
  • pow(x, y): x to the power of y
  • sin(x): Sine of x
  • cos(x): Cosine of x
  • tan(x): Tangent of x
  • arcsin(x): Inverse sine of x
  • arccos(x): Inverse cosine of x
  • arctan(x): Inverse tangent of x
  • floor(x): x rounded down to the nearest integer
  • ceil(x): x rounded up to the nearest integer
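
The broadcasting rule described at the top of this section can be sketched in plain Python. This is a model of the stated behavior only, not Tabeline's implementation: `broadcast` is a hypothetical helper, arrays are modeled as lists, and `None` stands in for null.

```python
import math


def broadcast(fn, *args):
    """Apply a scalar function elementwise, stretching scalars to vectors."""
    lengths = {len(a) for a in args if isinstance(a, list)}
    if not lengths:
        # Only scalars were given, so the result is a scalar.
        return None if any(a is None for a in args) else fn(*args)
    if len(lengths) != 1:
        raise ValueError("array inputs must have the same length")
    n = lengths.pop()
    # Treat each scalar as a constant vector of length n.
    columns = [a if isinstance(a, list) else [a] * n for a in args]
    return [
        None if any(v is None for v in row) else fn(*row)
        for row in zip(*columns)
    ]


scalar_result = broadcast(math.sqrt, 9.0)
array_result = broadcast(math.pow, [1.0, 2.0, None], 2.0)
absolute = broadcast(abs, [-1, None, 3])
```

The scalar exponent 2.0 is paired with every element of the array, and the null element stays null in the output.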

Numeric to boolean broadcast

For each null element, the result is null.

  • is_nan(x): True if x value is a floating point NaN
  • is_finite(x): True if x value is a floating point finite number

Casting broadcast

The functions in this section convert values from one type to another. These are completely dependent on the behavior of the Polars cast function. Nulls are preserved.

  • to_boolean(x): Convert x from a boolean, a float, or an integer to a boolean
  • to_integer(x): Convert x from a boolean, a float, or an integer to an integer or parse a string as an integer
  • to_float(x): Convert x from a boolean, a float, or an integer to a float or parse a string as a float
  • to_string(x): Deparse x to a string

Other broadcast

  • is_null(x): True if x is null. This is one of the few functions that returns a non-null value on null inputs.
  • if_else(condition, true_value, false_value): If condition is true, return true_value, otherwise return false_value.

Numeric to numeric reduction

The functions in this section consume an entire column of numbers and produce a scalar number. If any element is null, the result is null.

  • std(x): Population standard deviation of x
  • var(x): Population variance of x
  • max(x): Maximum of x
  • min(x): Minimum of x
  • sum(x): Sum of x
  • mean(x): Mean of x
  • median(x): Median of x
  • quantile(x, quantile): The quantile of x obtained via linear interpolation; quantile must be a float literal, not an expression
  • trapz(x, y): Numerically integrate y over x using the trapezoidal rule
  • interp(x, xp, fp): Linearly interpolate fp over xp at x; typically, x is a float literal
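
Several of these reductions have more than one common definition, so a plain-Python sketch of the variants named above may help. These functions are illustrations, not Tabeline's implementation. The quantile position `q * (n - 1)` is an assumed convention for linear interpolation, `interp` assumes `xp` is strictly increasing and raises outside its range as a placeholder choice, and `None` stands in for null.

```python
import math


def has_null(*columns):
    """True if any element of any column is null."""
    return any(v is None for column in columns for v in column)


def var(x):
    """Population variance: divide by n, not n - 1."""
    if has_null(x):
        return None
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def std(x):
    """Population standard deviation."""
    variance = var(x)
    return None if variance is None else math.sqrt(variance)


def quantile(x, q):
    """Quantile by linear interpolation between order statistics."""
    if has_null(x):
        return None
    ordered = sorted(x)
    position = q * (len(ordered) - 1)  # assumed convention
    lower = math.floor(position)
    upper = math.ceil(position)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def trapz(x, y):
    """Integrate y over x with the trapezoidal rule."""
    if has_null(x, y):
        return None
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(x, x[1:], y, y[1:])
    )


def interp(x, xp, fp):
    """Linearly interpolate the points (xp, fp) at the scalar x."""
    if x is None or has_null(xp, fp):
        return None
    for x0, x1, f0, f1 in zip(xp, xp[1:], fp, fp[1:]):
        if x0 <= x <= x1:
            return f0 + (f1 - f0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of xp")


spread_of_values = std([2, 4, 4, 4, 5, 5, 7, 9])
middle = quantile([4, 1, 3, 2], 0.5)
area = trapz([0, 1, 2], [0, 1, 4])
between = interp(1.5, [0, 1, 2], [0, 10, 20])
with_null = std([1, None, 3])
```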

Boolean to boolean reduction

The functions in this section consume an entire column of booleans and produce a scalar boolean. These follow Kleene logic with respect to null.

  • any(x): True if any of x are true
  • all(x): True if all of x are true
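
Kleene logic treats null as "unknown": the result is null only when the known values do not already decide the answer. A plain-Python sketch of that rule, with `None` standing in for null and hypothetical helper names chosen to avoid shadowing the built-ins:

```python
def kleene_any(x):
    """Three-valued any: a single true decides; otherwise null is unknown."""
    if any(v is True for v in x):
        return True
    if any(v is None for v in x):
        return None
    return False


def kleene_all(x):
    """Three-valued all: a single false decides; otherwise null is unknown."""
    if any(v is False for v in x):
        return False
    if any(v is None for v in x):
        return None
    return True


decided_any = kleene_any([False, None, True])  # a true value settles it
unknown_any = kleene_any([False, None])        # the null could be true
decided_all = kleene_all([True, None, False])  # a false value settles it
unknown_all = kleene_all([True, None])         # the null could be false
```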

Any to any reduction

The functions in this section consume an entire column of anything and produce a scalar of that type. These treat nulls as normal values.

  • first(x): The first value of x
  • last(x): The last value of x
  • same(x): One value of x if all values of x are the same, otherwise error
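
A plain-Python sketch of these three reductions, with `None` standing in for null and compared like any other value. The functions are illustrations rather than Tabeline's implementation; the exception type raised by `same` is an assumption, and the behavior on an empty column is not covered.

```python
def first(x):
    """The first value of the column, null or not."""
    return x[0]


def last(x):
    """The last value of the column, null or not."""
    return x[-1]


def same(x):
    """The common value if every element is equal, otherwise an error."""
    value = x[0]
    if any(v != value for v in x[1:]):
        raise ValueError("values are not all the same")
    return value


leading = first([None, 2, 3])       # a leading null is returned as is
trailing = last([1, 2, 3])
constant = same(["a", "a", "a"])
all_null = same([None, None])       # nulls count as equal to each other

try:
    same([1, None])
    mixed_raises = False
except ValueError:
    mixed_raises = True
```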

Argumentless functions

The functions in this section evaluate in the context of the DataFrame, not any particular column.

  • n(): The number of rows in the DataFrame
  • row_index0(): The 0-index of each row
  • row_index1(): The 1-index of each row
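
The two index conventions differ only in their starting value. A minimal plain-Python sketch, assuming a table modeled as a list of rows; these helpers take the rows explicitly, unlike the argumentless expression functions they illustrate.

```python
def n(rows):
    """The number of rows."""
    return len(rows)


def row_index0(rows):
    """The 0-based index of each row."""
    return list(range(len(rows)))


def row_index1(rows):
    """The 1-based index of each row."""
    return list(range(1, len(rows) + 1))


rows = ["a", "b", "c"]

count = n(rows)
zero_based = row_index0(rows)
one_based = row_index1(rows)
```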