Data and scripts associated with a manuscript investigating dissolved organic matter and microbial community linkages across seven globally distributed rivers

  • Robert E Danczak (Creator)
  • Amy E Goldman (Creator)
  • Mikayla A. Borton (Creator)
  • Rosalie Chu (Creator)
  • Jason G Toyoda (Creator)
  • Vanessa A Garayburu-Caruso (Creator)
  • Emily B. Graham (Creator)
  • Joseph W Morad (Creator)
  • Lupita Renteria (Creator)
  • Jacqueline R Wells (Creator)
  • Shai Arnon (Creator)
  • Scott Brooks (Creator)
  • Edo Bar-Zeev (Creator)
  • Michael Jones (Creator)
  • Nikki Jones (Creator)
  • Jorg Lewandowski (Creator)
  • Christof Meile (Creator)
  • Birgit M. Muller (Creator)
  • Beck Powers-McCormack (Creator)
  • John Schalles (Creator)
  • Hanna Schulz (Creator)
  • Adam Ward (Creator)
  • James C Stegen (Creator)



This data package is associated with the publication “Meta-metabolome ecology reveals that geochemistry and microbial functional potential are linked to organic matter development across seven rivers” submitted to Science of the Total Environment. It includes the data necessary to replicate the analyses presented in the manuscript, which investigates dissolved organic matter (DOM) development across broad spatial distances and within divergent biomes. Specifically, the package includes the Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) data, geochemistry data, annotated metagenomic data, and results from ecological null modeling analyses, along with the scripts necessary to generate the figures from the manuscript.

Complete metagenomic data associated with this data package can be found at the National Center for Biotechnology Information (NCBI) under BioProject PRJNA946291.

This dataset consists of (1) four folders; (2) a file-level metadata (flmd) file; (3) a data dictionary (dd) file; (4) a factor sheet describing samples; and (5) a readme. The FTICR Data folder contains (1) the processed FTICR-MS data; (2) a transformation-weighted characteristics dendrogram generated from the FTICR-MS data; and (3) the script used to generate all FTICR-MS related figures. The Geochemical Data folder contains (1) the single geochemistry data file and (2) the R script that generates the associated figures. The Metagenomic Data folder contains (1) annotation information across different levels; (2) carbohydrate active enzyme (CAZyme) information from the dbCAN database (Yin et al., 2012); (3) phylogenetic tree data (FASTAs, alignments, and tree file); and (4) the scripts necessary to analyze all of these data and generate figures. The Null Modeling Data folder contains (1) data generated during null modeling for each river and for all rivers combined and (2) the R scripts necessary to process the data. All files are .csv, .pdf, .tsv, .tre, .faa, .afa, .tree, or .R.
Date made available: 2024
Publisher: Environmental System Science Data Infrastructure for a Virtual Ecosystem
Geographical coverage: The shoreline of the Erpe River, Germany
