Vincenza Colonna


Vincenza Colonna


  +39 081 6132 254,, 

Population Genetics and Genetic Epidemiology

Keywords: population genomics

Genomics of early embryonic development

How natural selection acts on early human development

We investigate adverse outcomes of embryonic development like recurrent pregnancy loss and preeclampsia to identify the genetic factors that influence reproductive outcomes and pregnancy complications. This knowledge furthers our understanding of human evolution and informs efforts to improve pregnancy outcomes.

In CABERNET we aim to determine the extent of chromosomal mosaicism between embryonic and extraembryonic tissues using single cell DNA sequencing.
In DELIVER / NEONATE we want to identify genetic factors contributing to reproductive failure and recurrent miscarriage. We will use single cell strand sequencing to map balanced rearrangements and whole genome sequencing of euploid miscarried embryos to identify causative variants.
In CANDLE we want to uncover gene expression patterns associated with APOL1 risk alleles and preeclampsia in African American women. We will examine the role of ancestry in mediating the relationship between APOL1 genotype and preeclampsia risk. The results can provide insights into genetic and molecular basis of preeclampsia.

This project is in collaboration with


  • PRIN 2020J84FAM Ministero dell’Universita e della Ricerca

  • PRIN 2022WEHP7M Ministero dell’Universita e della Ricerca

  • PRIN P2022ZE75A Ministero dell’Universita e della Ricerca

  • Merigen Research s.r.l

  • POR Campania FSE 2014-2020 ASSE III – Ob. Sp. 14

Admixture Mapping and Pangenomics Analysis in the Biorepository and Integrative Genomics (BIG) Initiative Cohort

Improving Phenotype-Genotype Linkages through Ancestry Patterns

The BIG initiative is currently recruiting 30k participants from Memphis (TN), mostly children, with plans to include a total of 100,000 samples over the next five years. Nearly 45% of current participants are of African ancestry, and the cohort is partnering with the Genomic Information Commons, a consortium of top children’s hospitals, to conduct genomics research aimed at discovering the genetic foundations of human disease in diverse populations.

We want to understand the extent and origin of admixture patterns in the Biorepository and Integrative Genomics (BIG) Initiative cohort and to develop pipelines for admixture mapping of phenotypic traits related to diseases and natural selection phenotypes.


Global ancestry estimations for approximately 10,000 individuals in the BIG cohort (upper panel). Each vertical line represents an individual, and the colors indicate the proportion of different ancestries: African (blue), European (green), American (violet), and Asian (yellow). Based on visual inspection, five groups can be defined: individuals with exclusively African ancestry (a), those with European-African admixture (b), those with only European ancestry (c), those with American-European admixture (d), and those with predominantly Asian ancestry (e). Additionally, individuals with three- and four-way admixture are identified (f). For each group, a set of ten haplotypes is shown as an example, where local ancestry deconvolution has been performed, displaying ancestry tracts on chromosome 15. This information could be utilized for downstream analysis

We are using existing exome sequence data and electronic health records from 10k children to obtain an improved set of genetic variants, including structural variants through mapping against reference pangenomes and imputation.
We will reconstruct the demographic history of the individuals in the cohort at individual and population levels, determine the identity of the ancestral populations and deconvolute their contribution to the genome of single individuals. The ancestry information will be used to identify patterns of natural selection and map phenotypic traits.

This project is in collaboration with the Departments of Genetics, Genomics and Informatics, and Pediatrics at the University of Tennessee Health Science Center, in Memphis, TN

Human Pangenomics

A pangenome is a comprehensive collection of all the genetic variation present in a species, which overcomes the limitations of reference-based genomics by including both common and rare genetic variations in a single reference genome.

The Human Pangenome Reference Project aims to sequence 300 people to create a pangenome of 600 haplotypes and has currently released a first draft of the human pangenome reference based on 47 phased diploid assemblies from a group of genetically diverse individuals

Well-known population stratification is not visible in the p-arms of acrocentric chromosomes. This observation is compatible with recombination occurring between the p-arms of heterologous acrocentric chromosomes. Here is an example of chromosome 15. AFR: Africans; AMR: Native Americans; EAS: East-Asians; SAS: South-East Asians.

My team contributed to evaluate, for the first time, human population structure using markers from short arms of the acrocentric chromosomes [PMID: 37165242]. We found that markers from these regions have less power to distinguish populations compared to other regions. This is consistent with the understanding that short arms of acrocentric chromosomes undergo recombination between non-homologous chromosomes, similar to the X and Y pseudohomologous regions [PMID: 37165241]. Our findings on the patterns of linkage disequilibrium in these regions support this idea.

Mouse Pangenomics

Mice members of BXDs family have been inbred for 20-200 generations. They are of great value for mapping complex traits and phenome-wide association analyses. Current genomic studies on BXD assume a single linear reference genome, making it difficult to observe sequences diverging from the reference, therefore limiting the accuracy and completeness of analyses. Pangenome models overcome this limitation as they contain the full genomic information of a species.

We are building a reference pangenome for all extant members of all BXD families leveraging third generation and 10X sequence data. We will analyze the genetic variation in relation to thousands of phenotypes in the database.

(A.) odgi-vizlinear visualization of the pangenome of chromosome 19. Each line represents a haplotype. Line interruptions (white) are insertions in one or more strains, therefore deletions in the others (vertical white stripes). The left side is the centromere, the right side is the telomere.In these two regions sequences are fragmented. (B.) Extract of the pangenome from the Zfp91gene showing a 2,006 bp insertion found in DBA/2J and 48% of the BXD strains(green nodes in the graph). The insertion is in complete linkage with two other insertions of 4 bp and 135 bp in a region spanning 2.8 kbp. (C.) Strain-specific haplotypes (gray segments are not in scale)


The projects on pangenomics are in collaboration with the Department of Genetics, Genomics and Informatics, at the University of Tennessee Health Science Center, in Memphis, TN

Recombination between heterologous human acrocentric chromosomes.

Guarracino A, Buonaiuto S, de Lima LG, Potapova T, Rhie A, Koren S, Rubinstein B, Fischer C; Human Pangenome Reference Consortium; Gerton JL, Phillippy AM, Colonna V, Garrison E. Nature. 2023 May;617(7960):335-343. doi: 10.1038/s41586-023-05976-y. Epub 2023 May 10. PMID: 37165241
A draft human pangenome reference. Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK,
Monlong J, Abel HJ, Buonaiuto S, Chang XH, Cheng H, Chu J, Colonna V, [103 authors], Paten B. Nature. 2023 May;617(7960):312-324. doi: 10.1038/s41586-023-05896-x. PMID: 37165242; PMCID: PMC10172123.
Prioritization  of  putatively  detrimental  variants  in  euploid  miscarriages. Buonaiuto S, Biase ID, Aleotti V, Ravaei A, Marino A, Damaggio G, Chierici M, Pulijala M, D’AmbrosioP, Esposito G, Ayub Q, Furlanello C, Greco P, Capalbo A, Rubini M, Biase SD, Colonna V. Sci Rep.  2022 12(1):1997.   doi:10.1038/s41598-022-05737-3.  PMID: 35132093; PMCID: PMC8821623.
Efficient dynamic variation graphs  Eizenga J, Novak A, Kobayashi E, Villani F, Cisar C, Heumos S, Hickey G, Colonna V, Paten B, Garrison E Bioinformatics 2021 Jan 29;36(21):5139-5144. doi:10.1093/bioinformatics/btaa640. PMID: 33040146; PMCID: PMC7850124.
Genomic diversity and novel genome-wide association with fruit morphology in Capsicum, from 746kpolymorphic sites  Colonna V, D’Agostino N, Garrison E, Albrechtsen A, Jonas Meisner J, Facchiano A, Cardi T, TripodiP. Sci Rep. 2019 Jul 11;9(1):10067. doi: 10.1038/s41598-019-46136-5. PubMedPMID: 31296904
Positive selection in Europeans and East-Asians at the ABCA12 gene. Sirica R, Buonaiuto M, Petrella V, Sticco L, Tramontano D, Antonini D, Missero C, Guardiola O,Andolfi G, Kumar H, Ayub Q, Xue Y, Tyler-Smith C, Salvemini M, D’Angelo G, Colonna V Sci Rep. 2019 Mar 19;9(1):4843. doi:10.1038/s41598-019-40360-9. PubMed PMID: 30890716
Enrichment of low-frequency functional variants revealed by whole-genome sequencing of multiple isolated Europeanpopulations. Xue Y, Mezzavilla M, Haber M, McCarthy S, Chen Y, Narasimhan V, Gilly A, Ayub Q, Colonna V,Southam L, Finan C, Massaia A, Chheda H, Palta P, Ritchie G, Asimit J, Dedoussis G, Gasparini P,Palotie A, Ripatti S, Soranzo N, Toniolo D, Wilson JF, Durbin R, Tyler-Smith C, Zeggini E. Nat Commun. 2017 Jun 23;8:15927. PubMed PMID: 28643794
A global reference for human genetic variation. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, KorbelJO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. Nature. 2015 Oct 1;526(7571):68-74. PubMed PMID: 26432245
Genetic landscape ofpopulations along the Silk Road: admixture and migration patterns. Mezzavilla M, Vozzi D, Pirastu N, Girotto G, d’Adamo P, Gasparini P, Colonna V. BMC Genet. 2014 Dec 5;15:131.PubMed PMID: 25476266
Human genomic regions with exceptionally high levels of popula-tion differentiation identified from 911 whole-genome sequences. Colonna V, Ayub Q, Chen Y, Pagani L, Luisi P, Pybus M, Garrison E, Xue Y, Tyler-Smith C; 1000Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, HandsakerRE, Kang HM, Marth GT, McVean GA.Genome Biol. 2014 Jun 30;15(6):R88.doi: 10.1186/gb-2014-15-6-r88. PubMed PMID: 24980144
Integrative annotation ofvariants from 1092 humans: application to cancer genomics. Khurana E*, Fu Y*, Colonna V*, Mu XJ*, Kang HM, Lappalainen T, Sboner A, Lochovsky L, Chen J,Harmanci A, Das J, Abyzov A, Balasubramanian S, Beal K, Chakravarty D, Challis D, Chen Y, ClarkeD, Clarke L, Cunningham F, Evani US, Flicek P, Fragoza R, Garrison E, Gibbs R, G ̈um ̈us ZH, HerreroJ, Kitabayashi N, Kong Y, Lage K, Liluashvili V, Lipkin SM, MacArthur DG, Marth G, Muzny D,Pers TH, Ritchie GR, Rosenfeld JA, Sisu C, Wei X, Wilson M, Xue Y, Yu F; 1000 Genomes ProjectConsortium, Dermitzakis ET, Yu H, Rubin MA, Tyler-Smith C, Gerstein M. Science. 2013 Oct 4;342(6154):1235587.doi: 10.1126/science.1235587. PubMed PMID: 24092746
Small effective population size andgenetic  homogeneity  in  the  Val  Borbera  isolateColonna V, Pistis G, Bomba L, Mona S, Matullo G, Boano R, Sala C, Vigan`o F, Torroni A, Achilli A,Hooshiar Kashani B, Malerba G, Gambaro G, Soranzo N, Toniolo D.Eur J Hum Genet. 2013 Jan;21(1):89-94.  PubMedPMID: 22713810
An integrated map of genetic variation from 1,092human genomes. 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM,Handsaker RE, Kang HM, Marth GT, McVean GA. Nature. 2012 Nov 1;491(7422):56-65. PubMed PMID: 23128226
IFITM3 restricts the morbidity and mortality associated with influenza. Everitt AR, Clare S, Pertel T, John SP, Wash RS, Smith SE, Chin CR, Feeley EM, Sims JS, Adams DJ,Wise HM, Kane L, Goulding D, Digard P, Anttila V, Baillie JK, Walsh TS, Hume DA, Palotie A, XueY,  Colonna V,  Tyler-Smith  C,  Dunning  J,  Gordon x SB;  GenISIS  Investigators;  MOSAIC  Investigators,Smyth RL, Openshaw PJ, Dougan G, Brass AL, Kellam P. Nature.  2012 Mar 25;484(7395):519-23.  PubMed PMID: 2244662
A world in a grain of sand: human history from geneticdata. Colonna V, Pagani L, Xue Y, Tyler-Smith C Genome Biol. 2011 Nov 21;12(11):234. PubMed PMID: 22104725
Human genome diversity: frequently asked questions. Barbujani G, Colonna V. Trends Genet. 2010Jul;26(7):285-95. PubMed PMID: 20471132

I am a genomicist and an expert in human evolutionary and population genomics and bioinformatics. 

I graduated in Evolutionary Biology from the University of Naples Federico II and did postdoctoral research at the University of Ferrara (Italy) and at Wellcome Trust Sanger Institute in Cambridge (UK). I am now leading the Population genomics laboratory  at the IGB-CNR (Naples, Italy) and I am Assistant Professor at the University of Tennessee, College of Medicine, in the Department of Genetics, Genomics and Informatics 

In my postdoctoral research I was part of the international consortium 1000 Genomes[PMID: 26432245; 23128226] where I led contributions to two specific aspects. First, I contributed to develop FunSeq [PMID: 24092746], a tool that integrates non-coding information from relevant biological databases for the functional characterization of non-coding variants. Second, I lead a genome-wide scan to identify genomic regions with exceptionally high levels of population differentiation [PMID: 24980144] demonstrating that these regions are enriched for positive selection events and that one half may be the result of classic selective sweeps. Findings from both sub-projects have since been applied to demographic inference and the molecular diagnosis of cancer and myeloid malignancies [PMID: 27121471, 22446628], and to deeper studies on positive selection at the ABCA12 gene [PMID: 30890716]. 

During my PhD I worked on human isolated populations contributing to characterize several isolated populations, describing the genomic consequences of isolation [PMID: 17476112, 19550436, 22713810], contributing to genetic association studies [PMID: 16611673, 18162505] and to characterize rare variation [PMID: 28643794]

Silvia Buonaiuto
Postdoctoral fellow
I have a PhD degree From the university Luigi Vanvitelli, a master’s degree in Biology from the University of Napoli Federico II. I did a master thesis in molecular biology at the Department of Biology.

I am currently involved in the DELIVER project, which aims to unravel unexplored genetic variations related to reproductive failure. In this project, my primary objective is to pinpoint genetic variants that may be responsible for recurrent spontaneous miscarriages, with the ultimate goal of enhancing prenatal diagnosis.
Additionally, I am contributing to the Admixture Mapping and Pangenomics Analysis in the Biorepository and Integrative Genomics (BIG) Initiative Cohort project. In this initiative, my role involves establishing connections between genotypes and phenotypes through the execution of burden analysis for rare variants and admixture mapping analysis.

Franco Marsico
Postdoctoral fellow
I earned my degree in Biology from the University of Buenos Aires, Argentina, where I also completed my PhD in Computational Biology at the Calculus Institute. My research primarily focused on developing mathematical models for kinship inference, employing a Bayesian Approach. I am a postdoc in the Colonna lab, where my work centers on population genomics.

I am currently working on Admixture Mapping and Pangenomics Analysis in the Biorepository and Integrative Genomics (BIG) Initiative Cohort project. My focus is on studying recent natural selection signals in admixed populations. Additionally, I have a deep interest in evolution and how to compute processes that shape the history of life.

Gianluca Damaggio
PhD student
I have just completed my Ph.D. in Biology at the University of Napoli Federico II and obtained a master’s degree in Molecular Biology from the same university.

My project, HD-DittoGraph - a digital human Embryonic Stem Cell platform for Hungtinton’s repeats, aims to develop the capability to accurately detect perturbations in short tandem repeats of the Huntington’s gene within proliferative cells, utilizing third-generation sequencing data. Throughout my Ph.D., I have also gained expertise in analyzing data from single-cell RNA-seq and ATAC-seq. Currently, I am a visiting student at the IGB-CNR in Naples and a Junior Research Fellow at the University of Milano Statale in Elena Cattaneo’s Laboratory.

Flavia Villani
PhD student
I obtained a Master’s degree in Medical Biotechnology from the University of Naples Federico II in Italy. Currently, I am a PhD candidate at the University of Tennessee Health Science Center in the Colonna lab.

I am working on the Mouse Pangenomics Project. My research aims to construct the pangenome graph of model organisms, specifically inbred mice and rats, using a combination of short and long-read sequence data. This will enable genome-wide association studies to be performed directly on the pangenome. Additionally, I have a strong interest in understanding how mobile genetic elements have driven genome evolution through various mechanisms.

Rosanna Maione
Research Associate
I have a master degree in Medical Biotechnology from the University of Naples Federico II and a master in Embryology. I am an expert in embryology and assisted human reproduction.

I am in charge of recruitment, sample collection and processing for the projects related to the genomics of early embryonic development. I am responsible for building the DNA and tissue collections.