
Changing group levels

group_by is not so much a verb as it is a preposition. group_by does not change the contents or order of any rows or columns, but it changes the context of subsequent verbs. All rows that have the same values in all group_by columns are members of the same subframe. Subsequent verbs act as if they were applied to each subframe individually. So in filter or mutate, an expression containing max or mean is evaluated over only the rows in each subframe.
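The subframe idea can be sketched in plain Python, without Tabeline. This is only an illustration of the semantics described above, not how the library is implemented; the names rows, group_columns, and subframes are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Illustrative data: the same table used in the group_by example.
rows = [
    {"id": "a", "x": 1},
    {"id": "a", "x": 2},
    {"id": "b", "x": 3},
    {"id": "b", "x": 4},
]
group_columns = ["id"]  # hypothetical name for the grouped columns

# Rows with equal values in every group column belong to the same subframe.
subframes = defaultdict(list)
for row in rows:
    key = tuple(row[name] for name in group_columns)
    subframes[key].append(row)

# Evaluate mean(x) once per subframe.
group_means = {
    key: mean(member["x"] for member in members)
    for key, members in subframes.items()
}

# Attach the result to every row; row order and existing values are unchanged.
result = [
    {**row, "mean": group_means[tuple(row[name] for name in group_columns)]}
    for row in rows
]

for row in result:
    print(row)
```

Each row keeps its position and its original values; only the rows over which mean is computed change with the grouping.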

Tabeline differs from other popular data grammar libraries in how it handles groups. An instance of DataFrame has group levels. Each invocation of group_by adds one group level, which can contain any number of column names by which to group. The flattened list of group names from all levels can be accessed via the group_names property.

group_by

Add a set of column names as a new group level. The column names must exist in the data frame, and none of them may already be grouped by an earlier level.

from tabeline import DataFrame

df = DataFrame(
    id=["a", "a", "b", "b"],
    x=[1, 2, 3, 4],
)

df.group_by("id").mutate(mean="mean(x)")
# group levels: [id]
# shape: (4, 3)
# ┌─────┬─────┬──────┐
# │ id  ┆ x   ┆ mean │
# │ --- ┆ --- ┆ ---  │
# │ str ┆ i64 ┆ f64  │
# ╞═════╪═════╪══════╡
# │ a   ┆ 1   ┆ 1.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ a   ┆ 2   ┆ 1.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ b   ┆ 3   ┆ 3.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ b   ┆ 4   ┆ 3.5  │
# └─────┴─────┴──────┘

ungroup

Drop the last group level.

Because of group levels, ungroup behaves differently from ungroup in dplyr. It does not remove all group names, only those present in the last group level.
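With more than one group level, the difference becomes visible. The following is a rough pure-Python model of the bookkeeping described on this page, assuming group levels behave like a stack. GroupLevels is a hypothetical class written for this illustration and is not part of Tabeline; the page does not say what ungroup does when there are no group levels, so the sketch simply returns the state unchanged in that case.

```python
class GroupLevels:
    """Hypothetical stand-in for the grouping state of a data frame."""

    def __init__(self, column_names, levels=()):
        self.column_names = tuple(column_names)
        # One tuple of column names per group_by call, oldest first.
        self.levels = tuple(levels)

    @property
    def group_names(self):
        # Flattened list of grouped column names across all levels.
        return [name for level in self.levels for name in level]

    def group_by(self, *names):
        # Columns must exist and must not already be grouped.
        for name in names:
            if name not in self.column_names:
                raise ValueError(f"unknown column: {name}")
            if name in self.group_names:
                raise ValueError(f"already grouped: {name}")
        return GroupLevels(self.column_names, self.levels + (names,))

    def ungroup(self):
        # Drop only the most recent level; earlier levels stay in place.
        # Assumption: with no levels, the state is returned unchanged.
        return GroupLevels(self.column_names, self.levels[:-1])


state = GroupLevels(["study", "id", "x"]).group_by("study").group_by("id")
print(state.group_names)            # ['study', 'id']
print(state.ungroup().group_names)  # ['study']
```

In this model a single ungroup call leaves the data still grouped by study, whereas a dplyr-style ungroup would clear both names at once.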

from tabeline import DataFrame

df = DataFrame(
    id=["a", "a", "b", "b"],
    x=[1, 2, 3, 4],
)

df.group_by("id").ungroup().mutate(mean="mean(x)")
# shape: (4, 3)
# ┌─────┬─────┬──────┐
# │ id  ┆ x   ┆ mean │
# │ --- ┆ --- ┆ ---  │
# │ str ┆ i64 ┆ f64  │
# ╞═════╪═════╪══════╡
# │ a   ┆ 1   ┆ 2.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ a   ┆ 2   ┆ 2.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ b   ┆ 3   ┆ 2.5  │
# ├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ b   ┆ 4   ┆ 2.5  │
# └─────┴─────┴──────┘