Overview
This is an index of the main functionality available in Tabeline. Each constructor and verb has a link to more detailed documentation. The expression functions do not yet have detailed documentation, but should be self-explanatory.
Creation
These are the ways to create a DataFrame from something that is not already a DataFrame.
DataFrame(**columns)
: Construct data frame directly from columns
DataFrame.from_dict(columns)
: Construct data frame directly from columns
DataFrame.read_csv(filename)
: Read data frame from CSV file
DataFrame.from_pandas(df)
: Convert from Pandas DataFrame
DataFrame.from_polars(df)
: Convert from Polars DataFrame
Export
These are the various things that a DataFrame can be converted into.
to_dict()
: Convert to dictionary of columns
to_pandas()
: Convert to Pandas DataFrame
to_polars()
: Convert to Polars DataFrame
write_csv(filename)
: Write to CSV file
Verbs
Each verb is a method of DataFrame.
Column reorganization
Row removal
filter
: Keep rows for which predicate is true
slice0
: Keep rows given by 0-index
slice1
: Keep rows given by 1-index
distinct
: Drop rows with duplicate values under given columns
unique
: Drop rows with duplicate values under all columns
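The difference between the two slicing verbs, and between distinct and unique, can be modeled in plain Python. This is only a sketch of the semantics described above, not Tabeline's implementation: rows are represented as a list of dictionaries, the helper functions are hypothetical stand-ins, and keeping the first occurrence of a duplicate is an assumption.

```python
# Plain-Python model of row selection; these helpers are illustrative only.
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
    {"id": 3, "color": "red"},
]


def slice0(rows, indexes):
    """Keep the rows at the given 0-based positions."""
    return [rows[i] for i in indexes]


def slice1(rows, indexes):
    """Keep the rows at the given 1-based positions."""
    return [rows[i - 1] for i in indexes]


def distinct(rows, columns):
    """Drop rows whose values under `columns` were already seen.

    Assumption: the first occurrence is the one that is kept.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def unique(rows):
    """Drop rows that duplicate an earlier row under all columns."""
    return distinct(rows, list(rows[0]))


first_two_by_0 = slice0(rows, [0, 1])  # positions 0 and 1
first_two_by_1 = slice1(rows, [1, 2])  # positions 1 and 2, same rows
distinct_colors = distinct(rows, ["color"])
unique_rows = unique(rows)
```

Here the two slices select the same rows, distinct on color keeps one row per color, and unique removes only the fully repeated last row.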
Row reordering
sort
: Sort data frame according to given columns
cluster
: Bring rows together with same values under given columns
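To illustrate how clustering differs from sorting, here is a plain-Python sketch. It assumes that clustering keeps groups in order of first appearance and preserves row order within each group; that ordering is an assumption, and the `cluster` helper below is a hypothetical stand-in rather than Tabeline's code.

```python
rows = [
    {"group": "b", "value": 1},
    {"group": "a", "value": 2},
    {"group": "b", "value": 3},
]


def cluster(rows, column):
    """Bring rows with equal values under `column` next to each other.

    Assumption: groups appear in order of first occurrence.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for row in rows:
        groups.setdefault(row[column], []).append(row)
    return [row for group in groups.values() for row in group]


clustered = cluster(rows, "group")
sorted_rows = sorted(rows, key=lambda row: row["group"])
```

In this model, clustering leaves the "b" rows first because "b" appears first in the input, while sorting moves the "a" row to the front.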
Column mutation
mutate
: Create or update columns according to given expressions
transmute
: Mutate while dropping existing columns
Grouping
Summarizing
summarize
: Reduce each group to a single row according to given expressions
Reshaping
Joining
inner_join
: Merge data frames, dropping unmatched
outer_join
: Merge data frames, adding nulls for unmatched
left_join
: Merge data frames, adding nulls for unmatched on the left data frame
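The three join flavors can be sketched in plain Python on lists of dictionaries. This is a model of the usual inner, outer, and left join semantics, not Tabeline's implementation; the `join` helper, its `how` parameter, and the reading of left join as "keep every left row and fill missing right columns with null" are assumptions made for illustration.

```python
left = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}]
right = [{"id": 2, "y": 10.0}, {"id": 3, "y": 20.0}]


def join(left, right, key, how):
    """Join two non-empty row lists on `key`; `how` is inner, left, or outer."""
    left_cols = [c for c in left[0] if c != key]
    right_cols = [c for c in right[0] if c != key]

    # Index the right rows by key value for lookup.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result = []
    matched = set()
    for lrow in left:
        partners = right_by_key.get(lrow[key], [])
        if partners:
            matched.add(lrow[key])
            for rrow in partners:
                result.append({**lrow, **rrow})
        elif how in ("left", "outer"):
            # Unmatched left row: right-hand columns become null.
            result.append({**lrow, **{c: None for c in right_cols}})

    if how == "outer":
        for rrow in right:
            if rrow[key] not in matched:
                # Unmatched right row: left-hand columns become null.
                nulls = {c: None for c in left_cols}
                result.append({key: rrow[key], **nulls, **rrow})
    return result


inner = join(left, right, "id", "inner")
left_joined = join(left, right, "id", "left")
outer = join(left, right, "id", "outer")
```

With this data, the inner join keeps only id 2, the left join also keeps id 1 with a null y, and the outer join additionally keeps id 3 with a null x.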
Concatenating
concatenate_rows
: Concatenate rows of data frames
concatenate_columns
: Concatenate columns of data frames
Functions
These are the operators and functions available in the string expressions.
Operators
If either operand is null, the result is null.
x + y
: x plus y or, if strings, x concatenated to y
x - y
: x minus y
x * y
: x times y
x / y
: x divided by y
x % y
: x mod y
x ** y
: x to the power of y
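The null rule for operators can be modeled in plain Python by treating `None` as null. The sketch below is not how Tabeline evaluates expressions; `null_propagating` is a hypothetical helper that only demonstrates the stated behavior, where any null operand makes the result null.

```python
import operator


def null_propagating(op):
    """Wrap a binary operator so that a null (None) operand yields null."""

    def wrapped(x, y):
        if x is None or y is None:
            return None
        return op(x, y)

    return wrapped


add = null_propagating(operator.add)
power = null_propagating(operator.pow)

number_sum = add(1, 2)           # numeric addition
joined = add("ab", "cd")         # string concatenation
null_sum = add(1, None)          # a null operand gives null
cube = power(2, 3)               # 2 to the power of 3
```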
Numeric to numeric broadcast
The functions in this section perform mathematical operations on numbers. If these functions receive only scalar inputs, they return a scalar. If they receive any array inputs, any scalars are interpreted as constant vectors and an array is returned. For each null element, the result is null.
abs(x)
: Absolute value of x
sqrt(x)
: Square root of x
log(x)
: Natural logarithm of x
log2(x)
: Base-2 logarithm of x
log10(x)
: Base-10 logarithm of x
exp(x)
: Euler's number e to the power of x
pow(x, y)
: x to the power of y
sin(x)
: Sine of x
cos(x)
: Cosine of x
tan(x)
: Tangent of x
arcsin(x)
: Inverse sine of x
arccos(x)
: Inverse cosine of x
arctan(x)
: Inverse tangent of x
floor(x)
: x rounded down to the nearest integer
ceil(x)
: x rounded up to the nearest integer
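The broadcasting rule for these functions can be sketched in plain Python, using lists for arrays and `None` for null. The `broadcast` helper is hypothetical and only models the behavior described in this section: scalars in, scalar out; any array in, array out, with scalars repeated and nulls preserved element by element.

```python
import math


def broadcast(func, *args):
    """Apply `func` elementwise, treating scalars as constant vectors."""
    lengths = {len(a) for a in args if isinstance(a, list)}
    if not lengths:
        # Only scalars were given, so a scalar comes back.
        return None if any(a is None for a in args) else func(*args)
    if len(lengths) > 1:
        raise ValueError("array arguments must have the same length")
    n = lengths.pop()
    # Repeat each scalar n times so every argument is a column.
    columns = [a if isinstance(a, list) else [a] * n for a in args]
    return [
        None if any(v is None for v in row) else func(*row)
        for row in zip(*columns)
    ]


scalar_root = broadcast(math.sqrt, 4.0)
squares = broadcast(pow, [1.0, 2.0, None], 2)
floors = broadcast(math.floor, [1.5, -1.5])
```

In the second call, the scalar exponent 2 is applied to every element and the null element stays null.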
Numeric to boolean broadcast
For each null element, the result is null.
is_nan(x)
: True if x is a floating-point NaN
is_finite(x)
: True if x is a finite floating-point number
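NaN and null are different things, which a short plain-Python sketch makes visible. The helpers below are hypothetical models built on the standard `math` module, with `None` standing in for null; they illustrate the described behavior rather than Tabeline's own code.

```python
import math


def is_nan(x):
    """True for NaN, False for other floats, null for null."""
    return None if x is None else math.isnan(x)


def is_finite(x):
    """True for finite floats, False for NaN and infinities, null for null."""
    return None if x is None else math.isfinite(x)


column = [1.5, float("nan"), float("inf"), None]
nan_flags = [is_nan(v) for v in column]
finite_flags = [is_finite(v) for v in column]
```

Note that NaN and infinity are both non-finite, while the null element produces null in both results.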
Casting broadcast
The functions in this section convert values from one type to another. These are completely dependent on the behavior of the Polars cast function. Nulls are preserved.
to_boolean(x)
: Convert x from a boolean, a float, or an integer to a boolean
to_integer(x)
: Convert x from a boolean, a float, or an integer to an integer or parse a string as an integer
to_float(x)
: Convert x from a boolean, a float, or an integer to a float or parse a string as a float
to_string(x)
: Deparse x to a string
Other broadcast
is_null(x)
: True if x is null. This is one of the few functions that returns a non-null value on null inputs.
if_else(condition, true_value, false_value)
: If condition is true, return true_value, otherwise return false_value.
Numeric to numeric reduction
The functions in this section consume an entire column of numbers and produce a scalar number. If any element is null, the result is null.
std(x)
: Population standard deviation of x
var(x)
: Population variance of x
max(x)
: Maximum of x
min(x)
: Minimum of x
sum(x)
: Sum of x
mean(x)
: Mean of x
median(x)
: Median of x
quantile(x, quantile)
: The quantile of x obtained via linear interpolation; quantile must be a float literal, not an expression
trapz(x, y)
: Numerically integrate y over x using the trapezoidal rule
interp(x, xp, fp)
: Linearly interpolate fp over xp at x; typically, x is a float literal
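Several of these reductions have conventions worth spelling out. The plain-Python sketch below models population variance, a linearly interpolated quantile, the trapezoidal rule, and linear interpolation. The helper names are hypothetical, the quantile uses the common convention of interpolating at position q * (n - 1) in the sorted data (an assumption about the exact convention), and `interp` assumes xp is increasing and x lies within its range.

```python
import statistics


def quantile_linear(values, q):
    """Quantile by linear interpolation between neighboring sorted values."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def trapz(x, y):
    """Integrate y over x by summing the area of each trapezoid."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(x, x[1:], y, y[1:])
    )


def interp(x, xp, fp):
    """Linearly interpolate fp over xp at the single point x."""
    for x0, x1, f0, f1 in zip(xp, xp[1:], fp, fp[1:]):
        if x0 <= x <= x1:
            return f0 + (f1 - f0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of xp")


data = [1.0, 2.0, 3.0, 4.0]
population_variance = statistics.pvariance(data)  # divides by n, not n - 1
population_std = statistics.pstdev(data)
middle = quantile_linear(data, 0.5)
area = trapz([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
halfway = interp(0.5, [0.0, 1.0, 2.0], [0.0, 10.0, 20.0])
```

The population forms divide by the number of elements, which is why they differ from the sample statistics that divide by one fewer.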
Boolean to boolean reduction
The functions in this section consume an entire column of booleans and produce a scalar boolean. These follow Kleene logic with respect to null.
any(x)
: True if any of x are true
all(x)
: True if all of x are true
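Kleene logic treats null as "unknown": the result is null only when the known values do not already decide the answer. A plain-Python sketch, with `None` standing in for null and hypothetical helper names, looks like this:

```python
def kleene_any(values):
    """True if any value is true; otherwise null if any is null; else False."""
    if any(v is True for v in values):
        return True
    if any(v is None for v in values):
        return None
    return False


def kleene_all(values):
    """False if any value is false; otherwise null if any is null; else True."""
    if any(v is False for v in values):
        return False
    if any(v is None for v in values):
        return None
    return True


any_decided = kleene_any([False, None, True])   # a true value decides it
any_unknown = kleene_any([False, None])         # the null could be true
all_decided = kleene_all([True, None, False])   # a false value decides it
all_unknown = kleene_all([True, None])          # the null could be false
```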
Any to any reduction
The functions in this section consume an entire column of anything and produce a scalar of that type. These treat nulls as normal values.
first(x)
: The first value of x
last(x)
: The last value of x
same(x)
: One value of x if all values of x are the same, otherwise error
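The behavior of same can be sketched in plain Python. The helper below is a hypothetical model, not Tabeline's code: it treats null (`None`) as an ordinary value, as this section describes, and the choice of `ValueError` for the error case is an assumption. Behavior on an empty column is not specified here, so the sketch only handles non-empty input.

```python
def same(values):
    """Return the common value of a non-empty column, or raise if values differ."""
    first = values[0]
    if any(v != first for v in values[1:]):
        raise ValueError("values are not all the same")
    return first


constant = same(["kg", "kg", "kg"])
all_null = same([None, None])  # null is treated as a normal value

try:
    same(["kg", "g"])
    mixed_raised = False
except ValueError:
    mixed_raised = True
```

This kind of reduction is handy when summarizing a column that is expected to be constant within each group, since a violation surfaces as an error instead of a silently chosen value.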
Argumentless functions
The functions in this section evaluate in the context of the DataFrame, not any particular column.
n()
: The number of rows in the DataFrame
row_index0()
: The 0-index of each row
row_index1()
: The 1-index of each row
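As a plain-Python model of these three functions for a data frame with three rows: n() is a single number, while the two row index functions produce one value per row. The variable names below are illustrative only.

```python
column = ["a", "b", "c"]  # any column of a three-row data frame

n = len(column)                      # number of rows
row_index0 = list(range(n))          # 0-based index of each row
row_index1 = list(range(1, n + 1))   # 1-based index of each row
```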