Hi-C: High-throughput chromosome conformation capture – connect sequence and structure
Hi-C, a proximity ligation method, is one of a suite of chromosome conformation capture (3C) techniques originally devised to study the spatial organization of chromatin. Hi-C uses cost-effective, high-throughput, short-read sequencing to identify the nucleotide sequences of genomic loci that lie close together in three-dimensional space but may be far apart in the linear genome. The method has since enabled substantial improvements in genome assembly (of humans and other species) and in structural variant and epigenetic analysis, and has opened up many applications in metagenomics and microbiology.
New: The open-source software FALCON-Phase combines Hi-C data with high-accuracy, long-read sequencing data from PacBio® to create haplotype-resolved genome sequences on a chromosomal scale, even without parental genome data (see Nature Communications, April 21, 2021).
How Proximity Ligation (Hi-C) works: Chimeric junctions between adjacent sequences encode quantitative, long-range information
DNA is crosslinked in vivo, fixing all of the DNA in the cell in place. Crosslinking traps sequences that are in close proximity to one another, both within and between chromosomes across the entire genome. In microbes, interactions between genomic DNA and mobile genetic elements (e.g. plasmids and transposons) are also captured. The crosslinked DNA is then fragmented with endonucleases. The fragment ends are biotinylated and proximity ligated, creating chimeric junctions between spatially adjacent sequences. The biotinylated junctions are purified and subjected to deep, paired-end sequencing.
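The fragmentation step can be pictured with a minimal in-silico digest. This is only a sketch: the recognition site GATC, the cut position, the function name digest and the toy sequence are all illustrative assumptions, not details of any particular Hi-C protocol or kit.

```python
import re

def digest(sequence, site="GATC", cut_offset=0):
    """Split `sequence` at every occurrence of `site`.

    `site` and `cut_offset` are illustrative assumptions: the cut is
    placed `cut_offset` bases after the start of each recognition site.
    """
    # A lookahead is used so that overlapping sites would also be found.
    pattern = f"(?={re.escape(site)})"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, sequence)]
    bounds = [0] + cuts + [len(sequence)]
    # Keep only non-empty fragments (e.g. when a site sits at position 0).
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]

toy_sequence = "AAGATCTTTTGATCCC"  # made-up example sequence
fragments = digest(toy_sequence)
print(fragments)
```

In a real experiment the ends created by such cuts are what get biotinylated and ligated to whichever fragment end is physically nearby, which is how the chimeric junctions arise.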
The information contained in the chimeric junctions is not limited to sequence (position in the linear genome); it can also be decoded to reveal where each junction partner originated in the three-dimensional structure of the DNA. Proximity ligation reads are mapped against a draft assembly or shotgun data to improve the quality and reliability of assemblies of complex genomes and metagenomes. This, in turn, enables improved insights in many areas of biology and medicine.
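As a rough illustration of how mapped read pairs carry this long-range information, the sketch below tallies contacts between contigs and picks out pairs whose two ends map far apart on the same contig. The read-pair tuples, contig names, function names and the 20,000 bp threshold are hypothetical values chosen for the example; real pipelines work from alignment files and apply further filtering and normalization.

```python
from collections import Counter

# Hypothetical mapped read pairs:
# (contig of read 1, position 1, contig of read 2, position 2)
pairs = [
    ("contigA", 1200, "contigA", 850000),
    ("contigA", 5000, "contigB", 300),
    ("contigB", 700, "contigA", 9000),
    ("contigA", 100, "contigA", 400),
    ("contigC", 50, "contigC", 90),
]

def contact_counts(read_pairs):
    """Count read pairs linking each unordered pair of contigs."""
    counts = Counter()
    for contig1, _, contig2, _ in read_pairs:
        # Sort so that (A, B) and (B, A) are counted as the same contact.
        counts[tuple(sorted((contig1, contig2)))] += 1
    return counts

def long_range_cis(read_pairs, min_distance=20000):
    """Return pairs on one contig whose ends are at least min_distance apart."""
    return [
        p for p in read_pairs
        if p[0] == p[2] and abs(p[1] - p[3]) >= min_distance
    ]

counts = contact_counts(pairs)
distant = long_range_cis(pairs)
print(counts)
print(distant)
```

Counts between different contigs hint at which contigs sit near each other in the cell, which is the kind of signal used to order and group sequences in an assembly, while the long-range same-contig pairs reflect loci that are close in space despite being distant in the linear sequence.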