Animal stem cells probably appeared 700-800 million years ago. We found that two key molecules that make our stem cells what they are turn out to be even older than that: they existed before the evolution of animals. 1/n
Perspectives on Codebook: sequence specificity of uncharacterized human transcription factors biorxiv.org/content/10.1101/…
▶️ This preprint provides a summary of the release of a bunch of new preprints by the Codebook Consortium (see three posts below). #TFbinding
ALT We describe an effort ("Codebook") to determine the sequence specificity of 332 putative and largely uncharacterized human transcription factors (TFs), as well as 61 control TFs. Nearly 5,000 independent experiments across multiple in vitro and in vivo assays produced motifs for just over half of the putative TFs analyzed (177, or 53%), of which most are unique to a single TF. The data highlight the extensive contribution of transposable elements to TF evolution, both in cis and trans, and identify tens of thousands of conserved, base-level binding sites in the human genome. The use of multiple assays provides an unprecedented opportunity to benchmark and analyze TF sequence specificity, function, and evolution, as further explored in accompanying manuscripts. 1,421 human TFs are now associated with a DNA binding motif. Extrapolation from the Codebook benchmarking, however, suggests that many of the currently known binding motifs for well-studied TFs may inaccurately describe the TF's
Cross-platform DNA motif discovery and benchmarking to explore binding specificities of poorly studied human transcription factors (Vorontsov et al., 2024) biorxiv.org/content/10.1101/…
▶️ Codebook: 4,237 experiments, 394 human proteins
▶️ Codebook Motif Explorer mex.autosome.org
ALT A DNA sequence pattern, or "motif", is an essential representation of DNA-binding specificity of a transcription factor (TF). Any particular motif model has potential flaws due to shortcomings of the underlying experimental data and computational motif discovery algorithm. As a part of the Codebook/GRECO-BIT initiative, here we evaluated at large scale the cross-platform recognition performance of positional weight matrices (PWMs), which remain popular motif models in many practical applications. We applied ten different DNA motif discovery tools to generate PWMs from the "Codebook" data comprised of 4,237 experiments from five different platforms profiling the DNA-binding specificity of 394 human proteins, focusing on understudied transcription factors of different structural families. For many of the proteins, there was no prior knowledge of a genuine motif. By benchmarking-supported human curation, we constructed an approved subset of experiments comprising about 30% of all experim.
GHT-SELEX demonstrates unexpectedly high intrinsic sequence specificity and complex DNA binding of many human transcription factors (Jolma et al., 2024) biorxiv.org/content/10.1101/…
▶️ TFs possess sufficient intrinsic specificity to independently delineate cellular targets
Extensive binding of uncharacterized human transcription factors to genomic dark matter (Razavi et al., 2024) biorxiv.org/content/10.1101/…
▶️ 166 uncharacterized human TFs
▶️ "Dark TFs" mainly bind closed chromatin enriched for transposable elements
▶️ Some Dark TFs contain a KRAB domain
ALT Most of the human genome is thought to be non-functional, and includes large segments often referred to as "dark matter" DNA. The genome also encodes hundreds of putative and poorly characterized transcription factors (TFs). We determined genomic binding locations of 166 uncharacterized human TFs in living cells. Nearly half of them associated strongly with known regulatory regions such as promoters and enhancers, often at conserved motif matches and co-localizing with each other. Surprisingly, the other half often associated with genomic dark matter, at largely unique sites, via intrinsic sequence recognition. Dozens of these, which we term "Dark TFs", mainly bind within regions of closed chromatin. Dark TF binding sites are enriched for transposable elements, and are rarely under purifying selection. Some Dark TFs are KZNFs, which contain the repressive KRAB domain, but many are not: the Dark TFs also include known or potential pioneer TFs...
CTCF binding landscape is established by the epigenetic status of the nucleosome, well-positioned relative to CTCF motif orientation biorxiv.org/content/10.1101/…
The registration deadline for our FREE conference in Colchester has been extended until 1st September. The conference itself is on 16th September. Short talk slots are available, hurry up!
We are organising a conference "Genomics in Ageing and Disease", 16th September 2024, University of Essex, Colchester. More details soon gate.essex.ac.uk/
We have an opening for a three-year postdoctoral position for a researcher with expertise in machine learning and genome bioinformatics, to work on cancer diagnostics (liquid biopsies). Please apply here: vacancies.essex.ac.uk/tlive_…
D-box-binding protein alleviates vascular calcification in rats with chronic kidney disease by activating microRNA-195-5p and downregulating cyclin D1 dlvr.it/T16blq