Implements an entropy-informed pipeline for detecting emerging variants in viral amino acid sequence data, extending prior clustering-based approaches including hemagglutinin clustering methods (Li et al., 2015) <doi:10.1142/9789814667944_0018>. Provides a fully vectorized FASTA preprocessing toolkit covering header parsing, two-pass date and country extraction, ambiguous-residue filtering, and integer encoding under a 25-symbol amino acid alphabet. Computes per-site Shannon entropy across user-defined cumulative, sliding, or disjoint temporal partitions and clusters per-site entropy values using Gaussian mixture models via 'mclust' (Scrucca et al., 2016) <doi:10.32614/RJ-2016-021>. Quantifies temporal distributional shifts between partitions using the Hellinger distance (van der Vaart, 1998) <doi:10.1017/CBO9780511802256>, and detects temporal change points non-parametrically, using either energy statistics (Matteson and James, 2014) <doi:10.1080/01621459.2013.849605> via 'ecp' or wild binary segmentation (Fryzlewicz, 2014) <doi:10.1214/14-AOS1245> via 'HDcpDetect'. Per-site amino acid frequency tables and entropy trajectory plots characterize sequence composition and evolutionary dynamics across time. A configurable multi-variant simulation engine generates synthetic sequence time series with known ground truth for benchmarking detection pipelines. A curated dataset of SARS-CoV-2 Variants of Concern and Variants of Interest with associated lineage and surveillance metadata is included, along with a bundled National Center for Biotechnology Information (NCBI) Spike protein sample and vignettes demonstrating the full workflow.
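The two core quantities named above, per-site Shannon entropy and the Hellinger distance between temporal partitions, can be illustrated with a minimal sketch. It is written in plain Python for neutrality and is not the package's R interface: the function names `site_entropy`, `site_frequencies`, and `hellinger`, the toy sequences, and the choice to compare per-site residue frequencies (rather than whatever distribution the package compares internally) are all assumptions made for illustration.

```python
import math
from collections import Counter


def site_frequencies(sequences, site):
    """Relative frequency of each residue at one alignment column (hypothetical helper)."""
    counts = Counter(seq[site] for seq in sequences)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def site_entropy(sequences, site):
    """Shannon entropy, in bits, of the residues observed at one column."""
    freqs = site_frequencies(sequences, site)
    return sum(-p * math.log2(p) for p in freqs.values())


def hellinger(p, q):
    """Hellinger distance between two discrete distributions given as dicts.

    Uses H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies in [0, 1].
    """
    symbols = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(s, 0.0)) - math.sqrt(q.get(s, 0.0))) ** 2 for s in symbols
    )
    return math.sqrt(squared / 2.0)


# Toy aligned sequences for two disjoint temporal partitions (made-up data).
early = ["MKV", "MKV", "MKV", "MRV"]
late = ["MRV", "MRV", "MKV", "MRV"]

site = 1  # zero-based column index of the varying residue
entropy_early = site_entropy(early, site)
entropy_late = site_entropy(late, site)
shift = hellinger(site_frequencies(early, site), site_frequencies(late, site))

print(f"entropy early={entropy_early:.4f} late={entropy_late:.4f} hellinger={shift:.4f}")
```

In this toy case the two partitions have the same entropy at the chosen site, because the K/R proportions are simply swapped, yet the Hellinger distance is non-zero. That is the motivation for pairing an entropy summary with a distributional distance: entropy alone measures how mixed a site is, not which residues dominate it.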
| Version: | 0.6.2 |
| Depends: | R (≥ 3.5.0) |
| Imports: | ggplot2 (≥ 3.4.0), grDevices, HDcpDetect, ecp, kableExtra, lubridate, magrittr, mclust, rlang, stats, stringr, utils, zoo |
| Suggests: | Biostrings, DT, dplyr, here, knitr, readxl, rmarkdown, R.rsp, testthat (≥ 3.0.0) |
| Published: | 2026-05-30 |
| DOI: | 10.32614/CRAN.package.ViralEntropR (may not be active yet) |
| Author: | Vadim Tyuryaev |
| Maintainer: | Vadim Tyuryaev <vadim.tyuryaev at gmail.com> |
| BugReports: | https://github.com/vadimtyuryaev/ViralEntropR/issues |
| License: | MIT + file LICENSE |
| URL: | https://github.com/vadimtyuryaev/ViralEntropR, https://doi.org/10.5281/zenodo.19040165, https://vadimtyuryaev.github.io/ViralEntropR/ |
| NeedsCompilation: | no |
| Language: | en-GB |
| Materials: | README, NEWS |
| CRAN checks: | ViralEntropR results |
| Package source: | ViralEntropR_0.6.2.tar.gz |
| Windows binaries: | r-devel: not available, r-release: not available, r-oldrel: not available |
| macOS binaries: | r-release (arm64): ViralEntropR_0.6.2.tgz, r-oldrel (arm64): ViralEntropR_0.6.2.tgz, r-release (x86_64): ViralEntropR_0.6.2.tgz, r-oldrel (x86_64): ViralEntropR_0.6.2.tgz |
Please use the canonical form https://CRAN.R-project.org/package=ViralEntropR to link to this page.