group_split() works like base::split(), but:
It uses the grouping structure from group_by() and is therefore subject to the data mask.
It does not name the elements of the list based on the grouping, as this only works well for a single character grouping variable. Instead, use group_keys() to access a data frame that defines the groups.
group_split() is primarily designed to work with grouped data frames. You can pass ... to group and split an ungrouped data frame, but this is generally not very useful as you won't have easy access to the group metadata.
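The contrast with base::split() and the role of group_keys() can be sketched on a plain tibble. This is a minimal illustration using dplyr's data frame method, not the SummarizedExperiment method documented here; `df`, `g`, and `x` are hypothetical names chosen for the example.

```r
library(dplyr)

# Hypothetical toy data (not from the package).
df <- tibble(g = c("a", "b", "a", "b"), x = 1:4)

# base::split() names each element after its group.
names(split(df, df$g))

# group_split() returns an unnamed list, one tibble per group.
grouped <- group_by(df, g)
pieces  <- group_split(grouped)
names(pieces)

# group_keys() returns one row per group, in the same order as `pieces`,
# so the two can be matched by position.
keys <- group_keys(grouped)
keys$g
pieces[[1]]
```

Keeping the grouped object around, as `grouped` is here, is what gives access to the keys. Calling group_split(df, g) directly on the ungrouped data discards that metadata.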
# S3 method for class 'SummarizedExperiment'
group_split(.tbl, ..., .keep = TRUE)
.tbl: A tbl.
...: If .tbl is an ungrouped data frame, a grouping specification, forwarded to group_by().
.keep: Should the grouping columns be kept?
A list of tibbles. Each tibble contains the rows of .tbl for the associated group and all the columns, including the grouping variables.
Note that this returns a list_of, which is slightly stricter than a simple list but is useful for representing lists where every element has the same type.
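The return type and the effect of .keep can be checked on a small tibble. This sketch again uses dplyr's data frame method with hypothetical data (`df`, `g`, `x`). The examples on this page show the SummarizedExperiment method returning a list of SummarizedExperiment objects, so the list_of check below should be read as applying to the data frame case only.

```r
library(dplyr)

# Hypothetical toy data (not from the package).
df <- tibble(g = c("a", "b", "a"), x = 1:3)

with_keys    <- group_split(df, g)                 # .keep = TRUE is the default
without_keys <- group_split(df, g, .keep = FALSE)  # drop the grouping column

# For data frames, the result is a vctrs list_of rather than a bare list.
inherits(with_keys, "vctrs_list_of")

# Column names of the first group's tibble under each setting.
names(with_keys[[1]])
names(without_keys[[1]])
```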
group_split() is not stable because you can achieve very similar results by manipulating the nested column returned from tidyr::nest(.by =). That also retains the group keys all within a single data structure. group_split() may be deprecated in the future.
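A minimal sketch of the tidyr::nest(.by =) alternative on a plain tibble follows. It assumes tidyr 1.3.0 or later, where the `.by` argument is available; `df`, `g`, and `x` are hypothetical names. Whether the same call is supported for SummarizedExperiment objects is not stated on this page.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the package).
df <- tibble(g = c("a", "b", "a"), x = 1:3)

# One row per group: the key column `g` plus a list-column `data`
# holding the remaining columns for that group.
nested <- nest(df, .by = g)

nested$g          # the group keys
nested$data       # a list of tibbles, comparable to group_split() output
nested$data[[1]]  # rows belonging to the first key
```

Because the keys and the per-group data sit side by side in `nested`, there is no need for a separate group_keys() call to recover which element belongs to which group.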
Hutchison, W.J., Keyes, T.J., The tidyomics Consortium. et al. The tidyomics ecosystem: enhancing omic data analyses. Nat Methods 21, 1166–1170 (2024). https://doi.org/10.1038/s41592-024-02299-2
Wickham, H., François, R., Henry, L., Müller, K., Vaughan, D. (2023). dplyr: A Grammar of Data Manipulation. R package version 2.1.4, https://CRAN.R-project.org/package=dplyr
Other grouping functions: group_by(), group_map(), group_nest(), group_trim()
data(pasilla, package = "tidySummarizedExperiment")
pasilla |> group_split(condition)
#> [[1]]
#> class: SummarizedExperiment
#> dim: 14599 4
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(14599): FBgn0000003 FBgn0000008 ... FBgn0261574 FBgn0261575
#> rowData names(0):
#> colnames(4): untrt1 untrt2 untrt3 untrt4
#> colData names(2): type condition
#>
#> [[2]]
#> class: SummarizedExperiment
#> dim: 14599 3
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(14599): FBgn0000003 FBgn0000008 ... FBgn0261574 FBgn0261575
#> rowData names(0):
#> colnames(3): trt1 trt2 trt3
#> colData names(2): type condition
#>
pasilla |> group_split(counts > 0)
#> [[1]]
#> class: SummarizedExperiment
#> dim: 5536 7
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(5536): FBgn0000003 FBgn0000015 ... FBgn0261508 FBgn0261514
#> rowData names(0):
#> colnames(7): untrt1 untrt2 ... trt2 trt3
#> colData names(3): condition type counts > 0
#>
#> [[2]]
#> class: SummarizedExperiment
#> dim: 12359 7
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(12359): FBgn0000008 FBgn0000014 ... FBgn0261401 FBgn0261568
#> rowData names(0):
#> colnames(7): untrt1 untrt2 ... trt2 trt3
#> colData names(3): condition type counts > 0
#>
pasilla |> group_split(condition, counts > 0)
#> [[1]]
#> class: SummarizedExperiment
#> dim: 5271 4
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(5271): FBgn0000003 FBgn0000015 ... FBgn0260968 FBgn0261356
#> rowData names(0):
#> colnames(4): untrt1 untrt2 untrt3 untrt4
#> colData names(3): type condition counts > 0
#>
#> [[2]]
#> class: SummarizedExperiment
#> dim: 11886 4
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(11886): FBgn0000008 FBgn0000014 ... FBgn0261361 FBgn0261523
#> rowData names(0):
#> colnames(4): untrt1 untrt2 untrt3 untrt4
#> colData names(3): type condition counts > 0
#>
#> [[3]]
#> class: SummarizedExperiment
#> dim: 4990 3
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(4990): FBgn0000003 FBgn0000022 ... FBgn0261508 FBgn0261514
#> rowData names(0):
#> colnames(3): trt1 trt2 trt3
#> colData names(3): type condition counts > 0
#>
#> [[4]]
#> class: SummarizedExperiment
#> dim: 11730 3
#> metadata(2): latest_mutate_scope_report latest_select_scope_report
#> assays(1): counts
#> rownames(11730): FBgn0000008 FBgn0000014 ... FBgn0261401 FBgn0261568
#> rowData names(0):
#> colnames(3): trt1 trt2 trt3
#> colData names(3): type condition counts > 0
#>