Abstract

Functional genomics assays based on high-throughput sequencing greatly expand our ability to understand the genome. Here, we define the ENCODE blacklist: a comprehensive set of regions in the human, mouse, worm, and fly genomes that have anomalous, unstructured, or high signal in next-generation sequencing experiments independent of cell line or experiment. Removing the regions on the ENCODE blacklist is an essential quality measure when analyzing functional genomics data.
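
To illustrate the filtering step the abstract describes, here is a minimal sketch in Python. It assumes the blacklist and the regions under analysis (for example, called peaks) are both available as BED-style intervals, which are 0-based and half-open. The function names `parse_bed` and `remove_blacklisted` and all coordinates are hypothetical, chosen only for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED-style lines into (chrom, start, end) tuples.

    Only the first three tab-separated columns are used; extra columns are ignored.
    """
    intervals = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split("\t")[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def remove_blacklisted(regions, blacklist):
    """Return the regions that do not overlap any blacklist interval."""
    # Group blacklist intervals by chromosome so each region is only
    # compared against intervals on its own chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in blacklist:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in regions:
        # Half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < b_end and b_start < end
            for b_start, b_end in by_chrom.get(chrom, ())
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Hypothetical coordinates, for illustration only.
blacklist = parse_bed(["chr1\t100\t200\texample"])
peaks = parse_bed(["chr1\t50\t100", "chr1\t150\t250", "chr2\t100\t200"])
filtered = remove_blacklisted(peaks, blacklist)
print(filtered)
```

This sketch drops any region that overlaps a blacklisted interval by at least one base. The linear scan per region is adequate for small inputs; for genome-scale data, a sorted or indexed interval structure would be a more practical choice.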

Keywords

Blacklist; ENCODE; Genome; Genomics; Computational biology; Identification (biology); Biology; DNA sequencing; Human genome; Functional genomics; Genetics; Computer science; Gene; World Wide Web

MeSH Terms

Animals; Computational Biology; Databases, Genetic; Genome; Genomics; Humans; Sequence Analysis, DNA; Software

Publication Info

Year: 2019
Type: article
Volume: 9
Issue: 1
Pages: 9354
Citations: 1883
Access: Closed

Citation Metrics

OpenAlex: 1883
Influential: 75

Cite This

Haley M. Amemiya, Anshul Kundaje, Alan P. Boyle (2019). The ENCODE Blacklist: Identification of Problematic Regions of the Genome. Scientific Reports, 9(1), 9354. https://doi.org/10.1038/s41598-019-45839-z

Identifiers

DOI: 10.1038/s41598-019-45839-z
PMID: 31249361
PMCID: PMC6597582
