2026 preprint methods bioRxiv (Cold Spring Harbor Laboratory)

A layered standards framework for integrating single-cell and spatial omics data into brain cell atlases

Patrick L. Ray, Jeremy A. Miller, Dorota Jarecka, Kimberly A. Smith, Pamela Baker, Lydia Ng, Maryann E. Martone, Puja Trivedi, Rashmie Abeysinghe, Lisa Anderson, Anita Bandrowski, Vieth Edyta, Ashwin A. Bhandiwad, Tek Raj Chhetri, Licong Cui, Michelle Giglio, Jeff Goldy, Na Hong, Hao Huang, Yan Huang, Yasmeen Hussain, Nelson Johansen, Mariah Kenney, Lauren Kruse, Xiaojin Li, James Meldrim, Tyler Mollenkopf, Suvarna Nadendla, David Osumi-Sutherland, Raymond Sanchez, Richard H. Scheuermann, Shiqiang Tao, Charles Vanderburg, Yuntao Yang, Alex Ropelewski, Shoaib Mufti, Ed Lein, Hua Xu, W Jim Zheng, Satrajit S. Ghosh, Owen White, Michael Hawrylycz, Guo-Qiang Zhang, Carol L. Thompson

Identifiers and access

DOI
10.64898/2026.04.30.722039
Cited by
0

Key findings

The BRAIN Initiative Cell Atlas Network introduces a three-layer standards framework (assay-agnostic modelling, harmonised metadata, and an extensible cell-type taxonomy) that turns heterogeneous single-cell and spatial omics datasets into interoperable, reusable brain-cell-atlas products.

Abstract

Source: OpenAlex

The BRAIN Initiative Cell Atlas Network (BICAN) is generating large-scale multimodal datasets to profile cell types in the human, non-human primate, and mouse brain. The diversity of single-cell and spatial transcriptomic and epigenomic assays, combined with varied experimental contexts, multiple data-generating laboratories and distributed infrastructure, poses substantial challenges for data integration and reuse in BICAN. To address this, we implemented a standards framework that enables layered integration of these data into knowledge-ready products for interoperable brain cell atlases. This framework organizes data based on three progressively structured layers. First, we introduced an assay-agnostic modeling layer that unifies the representation of single-cell and spatial omics data using a common set of biological entities and processes assessed by diverse experimental techniques. Second, we implemented harmonized metadata standards that capture key experimental features linked to biospecimen provenance across heterogeneous tissue sources, species, and preparations, supporting integration and validation while minimizing burden on data contributors. Third, we present an extensible representation for data-driven cell type taxonomies that integrates molecular data with annotations, ontology mappings, and evidence. Together, these contributions represent an end-to-end framework that transforms heterogeneous datasets into structured, interoperable resources that support broad community reuse via mapping algorithms, annotation systems, and visualization platforms. This approach links biospecimen provenance with cell-level outputs and embeds these in a standardized taxonomy format, enabling downstream applications such as cross-dataset integration, reference mapping, and knowledge-driven analysis. More broadly, our work demonstrates a generalizable strategy for enabling an efficient data-to-knowledge pipeline in a large-scale consortium setting.

Topics

  • open-data-standards
  • reproducibility-tooling

Lab authors

This record was curated from the lab's CV, NCBI MyBibliography, and OpenAlex. See PROJECTS.md for how to add or correct an entry via a pull request.