Software and digital resources
All downloadable material listed on these pages, supplemented by the specifics given under the individual headers and chapters, is available for public use. Please note that, while great care has been taken, the software, code and data are provided "as is", and the Department of Food Science, Section for Food Microbiology, Gut Health and Fermentation at UCPH does not accept any responsibility or liability.
Course material for handling 16S data using R with phyloseq and other packages
https://mortenarendt.github.io/MicrobiomeDataAnalysis/index.html
Sparse partial least squares regression and classification using a phylogenetic similarity penalty
ASCA and permutation testing for non-orthogonal designs
https://github.com/mortenarendt/ASCA
Ref:
Rasmussen, M.A., Khakimov, B., Engel, J. and Jansen, J., 2024. Permutation Strategies for Inference in ANOVA‐Based Models for Nonorthogonal Designs Including Continuous Covariates. Journal of Chemometrics, p.e3580. (See publication)
Vertical transfer of microbes, analysed per individual ASV and combined using meta-analysis statistics.
https://github.com/mortenarendt/VagTransfer
https://github.com/mortenarendt/MBtransfeR
Refs:
Rasmussen, M.A., Thorsen, J., Dominguez-Bello, M.G., Blaser, M.J., Mortensen, M.S., Brejnrod, A.D., Shah, S.A., Hjelmsø, M.H., Lehtimäki, J., Trivedi, U. and Bisgaard, H., 2020. Ecological succession in the vaginal microbiota during pregnancy and birth. The ISME journal, 14(9), pp.2325-2335. (See publication)
Mortensen, M.S., Rasmussen, M.A., Stokholm, J., Brejnrod, A.D., Balle, C., Thorsen, J., Krogfelt, K.A., Bisgaard, H. and Sørensen, S.J., 2021. Modeling transfer of vaginal microbiota from mother to infant in early life. Elife, 10, p.e57051. (See publication)
Data fusion of vastly different data sources obtained on the same set of samples, using kernel transformation coupled with graphical modelling by graphical LASSO.
https://github.com/mortenarendt/KerGLASSO
Refs:
Nørgaard, S.K., Linder‐Steinlein, K., Eliasen, A.U., Stokholm, J., Chawes, B.L., Bønnelykke, K., Bisgaard, H., Smilde, A.K. and Rasmussen, M.A., 2021. On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets. Journal of Chemometrics, 35(10), p.e3324. (See publication)
Nørgaard, S.K., Følsgaard, N., Vissing, N.H., Kyvsgaard, J.N., Chawes, B., Stokholm, J., Smilde, A.K., Bønnelykke, K., Bisgaard, H. and Rasmussen, M.A., 2023. Novel Connections of Common Childhood Illnesses Based on More Than 5 Million Diary Registrations From Birth Until Age 3 Years. The Journal of Allergy and Clinical Immunology: In Practice, 11(7), pp.2162-2171. (See publication)
Bi-linear factorization of matrices with a generalized linear mapping (|Rb|1 < L) penalty on the parameters.
https://github.com/mortenarendt/genL1
Ref:
Arendt Rasmussen, M., 2017. Generalized L1 penalized matrix factorization. Journal of Chemometrics, 31(4), p.e2855. (See publication)
BactFlow is a pipeline for bacterial genome assembly from single-isolate and metagenomic sequencing reads generated on Oxford Nanopore Technologies (ONT) and Illumina platforms. It is built with Nextflow DSL2 and reads the generic outputs of the Guppy and Dorado basecallers.
This workflow includes the steps needed to analyse 16S rRNA microbiota amplicon data, from raw sequences to publication-quality visualizations and statistical analysis. Culture-independent 16S rRNA metagenomics is a promising method for understanding the ecology of an environment, in terms of the abundance and structure of the microbiome in association with environmental factors, e.g. host-microbiome interactions. The 16S rRNA gene, which encodes a component of the ribosome, is ubiquitous in prokaryotes; it is highly conserved among them and at the same time contains hypervariable regions (HVRs) V1 to V9, which are good targets for evolutionary and ecological studies on prokaryotes (Jünemann et al., 2017). This module focuses mainly on 16S rRNA gene data, but most of the techniques explained here can, with care, be applied to genome data and multivariate count datasets.
Note: the entire workflow was run in a Jupyter notebook on a 120 GB cluster node at Aarhus University, Denmark. To run tasks in parallel on different nodes, the QIIME 2 tasks were submitted to the cluster through separate bash scripts.
A Shiny app for interactive bioinformatics steps and statistical analysis on all count data, especially 16S rRNA gene and whole-genome (meta)genomics sequence data. The app includes all the controls needed to help you translate sequence data into high-resolution tables and plots. In other words, MicroLoop can complete in less than a day a task that might otherwise take weeks.
A Python package to create HTML-based reports, with support for text, tables, headers, code chunks, responsive tables and plots.
An interactive R package to convert relative abundances of 16S rRNA data into their respective copy numbers via an internal Lambda phage standard.
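The idea behind spike-in normalization can be sketched in a few lines. This is a minimal illustration only, not the package's actual interface: the function name, the taxon labels, the read counts and the number of spiked-in copies are all assumptions made for the example. It assumes that a known number of standard copies was added to the sample, so that the ratio between a taxon's reads and the standard's reads can be scaled to copy numbers.

```python
def to_copy_numbers(read_counts, spike_id, spike_copies_added):
    """Scale per-taxon reads to copy numbers using a spike-in standard.

    read_counts: dict mapping taxon name -> reads (or relative abundance).
    spike_id: key of the internal standard in read_counts (hypothetical label).
    spike_copies_added: known number of standard copies added to the sample.
    """
    spike_reads = read_counts[spike_id]
    if spike_reads <= 0:
        raise ValueError("spike-in standard was not detected in this sample")
    # Each read of the standard represents this many copies.
    copies_per_read = spike_copies_added / spike_reads
    # Report all taxa except the standard itself.
    return {
        taxon: reads * copies_per_read
        for taxon, reads in read_counts.items()
        if taxon != spike_id
    }


# Made-up example sample: 100 reads of the standard for 1e6 copies added.
sample = {"Lactobacillus": 600, "Bacteroides": 300, "lambda_phage": 100}
copies = to_copy_numbers(sample, "lambda_phage", 1e6)
```

Because only the ratio to the standard matters, the same calculation applies whether the input is raw reads or relative abundances. The sketch ignores differences in 16S rRNA gene copy number per genome.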
A function to create association networks for microbiome data: bacteria-bacteria and feature-metabolite associations. The function builds a graph/network from the Spearman (or Pearson) correlation and the significance level of that correlation, corrected for false discovery rate (FDR) by Benjamini-Hochberg (the default; other methods are also accepted, see the help page for the p.adjust() function). This is a custom function and, as it does not account for partial effects of taxa, you should use it only for visualization and not for validation of associations. The function can perform these analyses with or without centered log-ratio (CLR) transformation to account for differences in read depth. As input, you can simply pass the phyloseq object and the function will do the rest. By default, the graph is built from a data frame of the most significantly correlated ASVs.
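Two of the building blocks mentioned above, the CLR transformation and the Benjamini-Hochberg correction, can be sketched as follows. This is an illustrative Python sketch, not the R function itself: the helper names, the pseudocount, the 0.05 threshold and the correlation values are assumptions, and the correlations and p-values are taken as given rather than computed.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's counts.

    The pseudocount (an assumption of this sketch) avoids log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum so the
    # adjusted values are monotone in the original p-values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_edges(correlations, alpha=0.05):
    """Keep (node_a, node_b, rho) for pairs whose adjusted p-value <= alpha.

    correlations: list of (node_a, node_b, rho, p_value) tuples.
    """
    qvalues = bh_adjust([c[3] for c in correlations])
    return [
        (a, b, rho)
        for (a, b, rho, _), q in zip(correlations, qvalues)
        if q <= alpha
    ]


# Made-up pairwise results for four ASVs.
pairs = [
    ("ASV1", "ASV2", 0.82, 0.01),
    ("ASV1", "ASV3", -0.55, 0.04),
    ("ASV2", "ASV3", 0.60, 0.03),
    ("ASV3", "ASV4", 0.20, 0.20),
]
edges = significant_edges(pairs)
transformed = clr([10, 100, 1000])
```

The resulting edge list can be passed to any graph library for plotting. As noted above, edges from pairwise correlations are suitable for visualization only.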
This package is compatible with the output of the BioLector XT, which is an Excel file containing several sheets. The function takes the path to the Excel file, a working directory for the output files, the number of sheets in the Excel file (very important), and a logical (TRUE/FALSE) indicating the presence of biological (or technical) replicates. The function then generates time-series plots of the different filterset values over a specific time range.
NOTE: this package is only tuned for four filtersets: Biomass, pH, Riboflavin, and DO.
Unlike 16S rRNA amplicon sequencing, shotgun metagenomics targets all DNA present in the sample, e.g. from the colon. This means your samples will contain DNA from bacteria, host, archaea, and DNA viruses. Therefore, as a first step, the host DNA must be removed if it is not of interest. After decontamination, short reads are assembled into contigs or metagenome-assembled genomes (MAGs). For taxonomic annotation, MAGs were binned based on a nucleotide identity (NI) threshold and BLASTed against the database. All these steps were done using the ATLAS Snakemake workflow, and the results were analysed as demonstrated in this R Markdown document.