Beyond the STR Paradigm: A Comparative Analysis of Traditional Short Tandem Repeat Profiling, Investigative Genetic Genealogy, and Forensic Proteomics in the Era of Forensic Genomics
The future of forensic genomics is hybrid: integrating STR profiling, IGG, and proteomics yields stronger identification, kinship leads, and orthogonal confirmation. The article proposes a comparative framework and sets priorities in unified probabilistic tools, population-diverse validation, ethical governance, and AI-driven multi-omic fusion.
Abstract
Traditional short tandem repeat (STR) analysis via capillary electrophoresis has served as the cornerstone of forensic human identification for over three decades, enabling standardized database-driven matching through systems such as CODIS. However, persistent limitations in handling degraded DNA, interpreting complex mixtures, and generating investigative leads in the absence of database hits have driven the emergence of complementary modalities: investigative genetic genealogy (IGG, also termed FIGG), which compares dense single-nucleotide polymorphism (SNP) profiles against consumer genealogy databases, and forensic proteomics, which exploits mass spectrometry–derived genetically variant peptides (GVPs) and protein biomarkers for body-fluid identification, phenotypic inference, and individualization independent of nucleic acids. This article employs a comparative analytical framework to evaluate these three approaches across technical performance (sensitivity, discriminatory power), sample applicability, investigative utility, validation status, and ethical-regulatory constraints. Drawing on recent literature, it argues that STR profiling retains primacy for routine, high-throughput casework because of database interoperability and court admissibility, while IGG provides transformative kinship resolution for cold cases and proteomics offers orthogonal evidence in DNA-compromised samples. The analysis reveals synergistic potential in multi-omic workflows facilitated by next-generation sequencing (NGS) and artificial intelligence, while highlighting critical gaps in standardization, population representation, privacy safeguards, and probabilistic integration. These findings refine existing paradigms by advocating a hybrid forensic genomics framework that extends rather than supplants STR methods, with concrete implications for policy, laboratory practice, and future research in multi-modal evidence fusion.
Introduction
Forensic molecular genetics has undergone successive paradigm shifts since the introduction of restriction fragment length polymorphism (RFLP) analysis in the 1980s, yet the field’s operational core remains polymerase chain reaction (PCR)–based amplification and capillary electrophoresis (CE) sizing of autosomal STR loci. These markers, codified in national databases such as the U.S. Combined DNA Index System (CODIS, expanded to 20 core loci), afford high discriminatory power (combined random match probabilities often rarer than 1 in 10^18) and robust statistical interpretation frameworks. The research question animating this article is whether, and under what conditions, emerging genomic and proteomic modalities can meaningfully extend or supplant STR-based workflows to address contemporary challenges in criminalistics, missing-persons identification, and mass-disaster victim identification.
The stakes are substantial. Degraded or low-template samples, complex mixtures, and “no-hit” cases continue to frustrate investigations, while societal demands for rapid, precise, and ethically defensible forensic intelligence grow. This article advances a defensible thesis: traditional STR analysis, while indispensable for its standardization and database compatibility, is increasingly insufficient as a standalone modality. A comparative integration of IGG (SNP-driven distant-kinship inference via consumer genealogy databases) and forensic proteomics (mass spectrometry–based GVP genotyping and biomarker profiling) offers superior resolution, broader sample applicability, and novel investigative leads. Such integration, however, requires rigorous cross-validation, bioinformatics fusion, and harmonized ethical-regulatory frameworks to realize its potential without compromising justice or privacy. The analysis proceeds through a comparative lens, synthesizing primary literature to identify mechanisms of complementarity, causal relationships between technological affordances and case outcomes, and second-order implications for forensic practice and policy.


Literature Review
Scholarship in forensic genetics clusters into three DNA-centered schools, alongside a younger proteomic literature. The first, anchored in classical STR profiling, emphasizes empirical validation, population genetics, and probabilistic genotyping (e.g., continuous or semi-continuous mixture models). Reviews consistently affirm STRs’ strengths (high polymorphism, low mutation rates relative to earlier VNTRs, and seamless integration with existing infrastructure) yet recurrently document limitations: stutter artifacts, allelic dropout in low-copy-number or degraded samples, restricted multiplexing capacity, and inability to generate phenotypic or ancestry information.
A second school, propelled by NGS/massively parallel sequencing (MPS), advocates expanded marker panels (STR sequence variants plus ancestry-informative, phenotype-informative, and identity SNPs). This literature highlights how MPS overcomes CE constraints by providing sequence-level resolution, higher multiplexing, and simultaneous interrogation of mtDNA, Y-STRs, and SNPs. Forensic DNA phenotyping (FDP) and biogeographic ancestry inference emerge as key extensions.
The third school centers on IGG/FIGG, born from the 2018 Golden State Killer arrest via GEDmatch. Studies document its success in resolving hundreds of cold cases through dense autosomal SNP profiles (600,000–1 million markers) and identity-by-descent (IBD) segment analysis, yet foreground ethical contradictions: surreptitious use of consumer data, consent asymmetries, and risks of familial surveillance.
Proteomics occupies a nascent but rapidly maturing literature. Reviews delineate applications in body-fluid identification via specific protein biomarkers (e.g., hemoglobin for blood, semenogelin for semen), post-mortem interval (PMI) estimation through degradation patterns, age-at-death inference, and individualization via GVPs that encode nonsynonymous SNPs detectable by liquid chromatography–tandem mass spectrometry (LC-MS/MS). Proteomics is positioned as orthogonal to DNA methods, excelling precisely where nucleic acids degrade.
Competing theories thus diverge: one privileges continuity and standardization (STR-centric), another scalability and information density (NGS/genomic), and a third orthogonality and resilience (proteomic). Gaps persist: few studies perform systematic head-to-head comparisons across all three modalities; validation data for proteomic genotyping remain limited relative to STRs; and integrated multi-omic statistical frameworks are embryonic. This article addresses these lacunae through explicit comparative analysis.

Methodology / Analytical Framework
This study adopts a comparative analytical review grounded in forensic molecular genetics, population genetics, bioinformatics, and regulatory science. The framework evaluates the three modalities across five dimensions: (1) technical performance (limit of detection, discriminatory power via random match probability or likelihood ratios, robustness to degradation/mixtures); (2) sample applicability (quantity, quality, matrix effects); (3) investigative utility (database interoperability, lead generation, kinship resolution); (4) validation and admissibility (ISFG guidelines, Daubert/Frye standards, error rates); and (5) ethical-legal constraints (privacy, consent, bias, equity). Assumptions include reliance on peer-reviewed primary sources published 2015–2026, focus on human autosomal and lineage markers for criminal identification, and exclusion of non-forensic applications (e.g., clinical genomics). Scope conditions limit inference to operational feasibility in accredited laboratories; predictive claims about future adoption rest on current trajectories of NGS adoption, proteomic instrumentation maturation, and policy developments rather than speculative extrapolation. Limitations of inference include the absence of new empirical data and potential publication bias toward successful applications. The approach prioritizes mechanisms (e.g., how GVP inference encodes SNP genotypes) and causal pathways (e.g., how dense SNP data enables third-cousin detection with >90% probability).
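The causal pathway from database density to distant-cousin detection can be made concrete with a simple coverage model. The sketch below assumes that each of a person's relatives of a given degree is present in the database independently, with probability equal to the database's coverage of the relevant population. The function name, the relative count of 200, and the coverage values are illustrative assumptions, not figures drawn from the reviewed literature.

```python
def prob_at_least_one_match(n_relatives: int, coverage: float) -> float:
    """Probability that at least one of n_relatives is in the database.

    Assumes each relative is present independently with probability `coverage`.
    """
    if n_relatives < 0:
        raise ValueError("n_relatives must be non-negative")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must lie between 0 and 1")
    return 1.0 - (1.0 - coverage) ** n_relatives


# Hypothetical inputs: 200 third cousins, database covering 0.5% to 2% of the population.
for coverage in (0.005, 0.01, 0.02):
    p = prob_at_least_one_match(n_relatives=200, coverage=coverage)
    print(f"coverage={coverage:.3f}  P(at least one third cousin)={p:.3f}")
```

The model ignores ancestry-specific coverage and the chance that a true cousin shares no detectable IBD segment, both of which would lower the practical detection rate.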

Main Analysis / Results
Comparative evaluation reveals distinct yet complementary profiles. STR-CE profiling excels in standardized, high-throughput workflows: commercial multiplex kits amplify 20–28 loci with combined match probabilities routinely rarer than 1 in 10^18 in diverse populations, supported by vast national databases and mature probabilistic software. Its principal technical limitations are amplicon length (typically 100–400 bp), which makes it vulnerable to template fragmentation in environmentally insulted samples, and polymerase-slippage stutter, which complicates mixture interpretation. NGS partly mitigates both by sequencing the full STR repeat region (revealing isoalleles) alongside SNPs that can be typed from shorter amplicons, yet database compatibility remains anchored to length-based nomenclature.
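The very small combined match probabilities quoted for STR multiplexes follow from the product rule. The sketch below is a minimal illustration that assumes Hardy-Weinberg proportions within each locus and independence between loci, and omits the population-substructure correction that casework calculations normally apply. The allele frequencies and function names are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# One genotype per locus as (p, q) allele frequencies; q is None for a homozygote.
Genotype = Tuple[float, Optional[float]]


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg expectation: p^2 for a homozygote, 2pq for a heterozygote."""
    return p * p if q is None else 2.0 * p * q


def random_match_probability(profile: Iterable[Genotype]) -> float:
    """Product rule across loci, assuming the loci are independent."""
    rmp = 1.0
    for p, q in profile:
        rmp *= genotype_frequency(p, q)
    return rmp


# Hypothetical three-locus profile; frequencies are invented for illustration.
example_profile = [(0.10, 0.20), (0.15, None), (0.05, 0.30)]
print(f"RMP = {random_match_probability(example_profile):.3e}")
```

Because the per-locus terms multiply, twenty loci with genotype frequencies near 0.1 each would already give a product near 10^-20, which is why adding loci raises discriminatory power so quickly.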
IGG/FIGG operates via dense SNP genotyping (often microarray-derived or from low-coverage whole-genome sequencing), with the resulting profile uploaded to curated or consumer databases. Its discriminatory mechanism exploits cumulative IBD sharing: third- or fourth-cousin matches suffice to construct pedigrees when combined with traditional genealogy. Empirical outcomes include resolution of hundreds of cold cases and unidentified remains since 2018; operational limits include database size and access (GEDmatch holds roughly 2 million profiles, not all of which are available for law-enforcement matching), population bias (over-representation of European ancestries), and the requirement for confirmatory STR or SNP testing. Privacy externalities (third-party consent, data breaches, and the potential for genetic surveillance) constitute non-technical but operationally decisive constraints.
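The reliance on cumulative IBD sharing can be illustrated with the expected autosomal sharing between full cousins, which halves twice with each additional cousin degree. The sketch below uses the standard pedigree expectation (1/2)^(2n+1) for full n-th cousins. The parent-child value of about 3,400 cM used to convert fractions to centimorgans is an assumed round figure for illustration, and the function names are hypothetical.

```python
def expected_ibd_fraction(cousin_degree: int) -> float:
    """Expected autosomal fraction shared IBD by full n-th cousins: (1/2)^(2n + 1)."""
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be 1 or greater")
    return 0.5 ** (2 * cousin_degree + 1)


# Assumed illustrative value: centimorgans shared by a parent-child pair (fraction 0.5).
PARENT_CHILD_CM = 3400.0


def expected_shared_cm(cousin_degree: int, parent_child_cm: float = PARENT_CHILD_CM) -> float:
    """Scale the expected fraction to centimorgans relative to a parent-child pair."""
    return expected_ibd_fraction(cousin_degree) / 0.5 * parent_child_cm


for degree in (1, 2, 3, 4):
    fraction = expected_ibd_fraction(degree)
    print(f"cousin degree {degree}: fraction={fraction:.5f}, "
          f"approx. {expected_shared_cm(degree):.0f} cM")
```

These are pedigree averages only. Realized sharing varies widely around them, and a share of true third and fourth cousins carry no detectable segment at all, which is one reason genealogical triangulation across several matches is needed.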
Forensic proteomics, by contrast, interrogates stable protein products. LC-MS/MS detection of GVPs in hair shafts or bone enables inference of the underlying nonsynonymous SNPs (nsSNPs) without DNA extraction; reported random match probabilities for single-hair profiles reach 1 in 10^6–10^9 in pilot studies, and tissue-specific proteomes provide orthogonal confirmation of body fluids. Causal advantages include exceptional resilience to nucleases and environmental insult; applications extend to PMI estimation (protein deamidation ratios) and phenotypic traits (e.g., hair color via keratin variants). Limitations encompass dynamic range challenges in complex matrices, lower multiplexing throughput than NGS, and nascent statistical models for probabilistic genotyping of peptides. Cross-validation studies demonstrate complementarity: proteomics can succeed where DNA yield is negligible, while genomic data anchors peptide-to-genotype mapping.
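The step from detected peptides to inferred nsSNP genotypes can be sketched as follows. The example assumes a conservative rule in which detecting a peptide form implies that the corresponding allele is present, while failing to detect it implies nothing, since peptides are often missed in LC-MS/MS. It further assumes Hardy-Weinberg proportions and independent sites. The rule, function names, and allele frequencies are illustrative assumptions rather than a published protocol.

```python
from typing import FrozenSet, Iterable, Tuple


def candidate_genotypes(ref_detected: bool, var_detected: bool) -> FrozenSet[str]:
    """Genotypes compatible with the peptide detections at one nsSNP site."""
    if ref_detected and var_detected:
        return frozenset({"ref/var"})
    if var_detected:
        # The reference peptide may simply have been missed.
        return frozenset({"ref/var", "var/var"})
    if ref_detected:
        # The variant peptide may simply have been missed.
        return frozenset({"ref/ref", "ref/var"})
    return frozenset({"ref/ref", "ref/var", "var/var"})  # uninformative site


def compatible_frequency(genotypes: FrozenSet[str], var_freq: float) -> float:
    """Summed Hardy-Weinberg frequency of the compatible genotypes."""
    q = var_freq
    hwe = {"ref/ref": (1 - q) ** 2, "ref/var": 2 * q * (1 - q), "var/var": q ** 2}
    return sum(hwe[g] for g in genotypes)


def profile_frequency(sites: Iterable[Tuple[bool, bool, float]]) -> float:
    """Product over sites of (ref_detected, var_detected, variant allele frequency)."""
    freq = 1.0
    for ref_detected, var_detected, q in sites:
        freq *= compatible_frequency(candidate_genotypes(ref_detected, var_detected), q)
    return freq


# Hypothetical three-site GVP profile with invented variant allele frequencies.
print(profile_frequency([(True, True, 0.20), (False, True, 0.05), (True, False, 0.40)]))
```

Treating non-detection as uninformative keeps the resulting frequency conservative: each partially observed site contributes a larger compatible-genotype frequency than a fully resolved genotype would.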
Synthesis across dimensions yields non-obvious insights. First, discriminatory power is not zero-sum: STRs provide immediate database hits, IGG supplies distant-kinship leads, and proteomics adds orthogonal individualization in trace evidence. Second, sample-type specificity drives modality selection—degraded bone favors proteomics; touch DNA favors NGS-SNP panels; reference samples favor STRs. Third, second-order effects include reduced reliance on suspect-centric databases (IGG) and mitigation of “activity-level” interpretation gaps (proteomic fluid ID). Yet integration demands novel bioinformatics: multi-omic likelihood ratios fusing STR, SNP, and GVP data under unified probabilistic frameworks remain underdeveloped.
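A first approximation to a multi-omic likelihood ratio multiplies the per-modality ratios, which is equivalent to summing their base-10 logarithms. The sketch below makes the strong assumption that the STR, SNP, and GVP evidence are conditionally independent given each hypothesis. That assumption would fail wherever GVP-encoding nsSNPs are in linkage disequilibrium with markers in the SNP panel, so the result should be read as an upper bound on what naive fusion could claim. The function name and the input values are hypothetical.

```python
from math import log10
from typing import Mapping


def combined_log10_lr(likelihood_ratios: Mapping[str, float]) -> float:
    """Sum of log10 likelihood ratios, valid only under conditional independence."""
    total = 0.0
    for modality, lr in likelihood_ratios.items():
        if lr <= 0:
            raise ValueError(f"likelihood ratio for {modality} must be positive")
        total += log10(lr)
    return total


# Hypothetical per-modality likelihood ratios for a degraded sample.
evidence = {"partial STR profile": 1.0e4, "identity SNP panel": 1.0e6, "hair GVP profile": 1.0e3}
print(f"combined log10 LR = {combined_log10_lr(evidence):.1f}")
```

Working in log space keeps the arithmetic stable when individual ratios span many orders of magnitude, and it makes each modality's contribution to the total easy to report separately.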

Discussion
The comparative findings challenge monomodal explanations of forensic efficacy. Existing STR-centric models are incomplete because they undervalue information loss in degraded samples and investigative stagnation in no-hit cases; purely genomic accounts overlook proteomics’ independence from nucleic-acid integrity; and the proteomic literature, while promising, has yet to achieve the validation depth of STRs. Counterarguments (that proteomics lacks court admissibility, or that IGG risks civil liberties) hold partial validity but are addressable: targeted validation studies and tiered consent models (e.g., law-enforcement-only databases) can mitigate risks without foreclosing benefits. Alternative interpretations, such as viewing proteomics solely as a confirmatory tool, ignore its independent discriminatory potential. Limitations of the present analysis include reliance on published rather than proprietary validation data and under-representation of non-Western population cohorts. Broader implications encompass equity (database bias amplification), laboratory infrastructure (capital investment in MS and NGS), and policy (harmonization of IGG guidelines across jurisdictions). Relative to existing scholarship, this article contributes a unified comparative lens that surfaces integration pathways previously treated in isolation.

Conclusion
This comparative analysis establishes that the future of forensic genomics lies not in replacement but in principled integration of STR profiling, IGG, and proteomics. Core insights include the modality-specific strengths that, when fused, yield multiplicative investigative power: database-compatible routine identification, distant-kinship lead generation, and DNA-independent orthogonal confirmation. The article’s primary contribution is a rigorous framework for evaluating and operationalizing this hybrid paradigm, extending existing knowledge while refining understandings of multi-omic causality in forensic inference. Concrete future directions include: (1) development of unified probabilistic software integrating STR, SNP, and GVP data; (2) large-scale, population-diverse validation of proteomic genotyping under ISFG and SWGDAM guidelines; (3) ethical-regulatory roadmaps balancing privacy with public safety (e.g., opt-in forensic genealogy databases); and (4) interdisciplinary research on AI-driven evidence fusion and CRISPR-enhanced targeted proteomics. Realizing this vision demands sustained investment in instrumentation, bioinformatics, and cross-disciplinary governance to ensure forensic genomics advances both scientific rigor and societal trust.
References
Butler, John M. 2015. “The Future of Forensic DNA Analysis.” Philosophical Transactions of the Royal Society B: Biological Sciences 370 (1674): 20140252. https://doi.org/10.1098/rstb.2014.0252.
Duong, Van-An, Jong-Moon Park, Hee-Joung Lim, and Hookeun Lee. 2021. “Proteomics in Forensic Analysis: Applications for Human Samples.” Applied Sciences 11 (8): 3393. https://doi.org/10.3390/app11083393.
Gutiérrez-Hurtado, Itzae Adonai, Mayra Elizabeth García-Acéves, Yolanda Puga-Carrillo, Mariano Guardado-Estrada, Denisse Stephania Becerra-Loaiza, Víctor Daniel Carrillo-Rodríguez, Reynaldo Plazola-Zamora, Juliana Marisol Godínez-Rubí, Héctor Rangel-Villalobos, and José Alonso Aguilar-Velázquez. 2025. “Past, Present and Future Perspectives of Forensic Genetics.” Biomolecules 15 (5): 713. https://doi.org/10.3390/biom15050713.
Kayser, Manfred. 2025. “Forensic Genetics in the Omics Era.” Nature Reviews Genetics. https://doi.org/10.1038/s41576-025-00896-1.
Kling, Daniel, et al. 2021. “Investigative Genetic Genealogy: Current Methods, Knowledge and Practice.” Forensic Science International: Genetics 52: 102474.
Parker, G. J., et al. 2021. “Forensic Proteomics.” Forensic Science International: Genetics 54: 102529. https://doi.org/10.1016/j.fsigen.2021.102529.
(Additional supporting sources drawn from the reviewed literature are available upon request for full replication; the above constitute primary anchors for the comparative claims advanced.)