Rethinking junk DNA redux

The purpose of this thread:

  1. Share interesting findings regarding non-coding elements of the genome.
  2. Discuss the functionality of non-coding elements.
  3. Discuss how non-coding elements affect evolutionary trajectories.

The term “junk DNA” does not imply that junk DNA is actually functionless junk. It is merely a provisional label for sequences of DNA for which no function has been identified. Nothing more, nothing less.

Those interested can read up here on recent findings regarding these elements:
‘Junk’ DNA Has Important Role, Researchers Find
Saved By Junk DNA: Vital Role In The Evolution Of Human Genome
Junk DNA may have handed us a gripping future
Shaking up the theory of evolution
RNAs Taking Center Stage
Spare Gene Is Fodder For Fishes’ Evolution
Transposons, or Jumping Genes: Not Junk DNA?

Non-coding RNAs are an interesting group of molecules.

Mattick and Makunin I think adequately describe ncRNAs as:

The term non-coding RNA (ncRNA) is commonly employed for RNA that does not encode a protein

There are several classes of ncRNAs, and Mattick and Makunin describe a few of them. These include:

  1. microRNAs
  2. snoRNA
  3. Sense and antisense transcripts
  4. Other

1.5-2% of our genomes code for proteins, leaving 98-98.5% of the genome to be assigned as non-coding.
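
As a quick sanity check on those figures, here is a minimal Python sketch. The function name is my own, purely for illustration; it just subtracts the protein-coding share from 100%.

```python
def noncoding_percent(coding_percent):
    """Return the non-coding share of a genome (in percent),
    given the protein-coding share (in percent)."""
    if not 0 <= coding_percent <= 100:
        raise ValueError("coding_percent must be between 0 and 100")
    return 100 - coding_percent


# Coding estimates quoted above: 1.5% to 2% of the genome.
low = noncoding_percent(2.0)
high = noncoding_percent(1.5)
print(f"non-coding: {low}% to {high}%")
```

So a coding share of 1.5% to 2% corresponds to a non-coding share of 98% to 98.5%.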

It is interesting to note, however, that >90% of our genomes are in fact transcribed into RNA.
The discovery of eukaryotic genome design and its forgotten corollary—the postulate of gene regulation by nuclear RNA

We now know that much of the genome of creatures like us is copied into RNA. Earlier methods missed this, in part because only the RNA coming from so-called single-copy DNA elements was scored and also because today’s methods are 100–10,000 times more sensitive. The modern tally says that >90% of the genome is copied into RNA (although the current methods do not always define whether these arise from bona fide transcription start sites as opposed to random RNA polymerase binding to DNA).

Here is a nice ppt describing non-coding RNAs.
Two main classes of ncRNAs:
Housekeeping and Regulatory.

Since we know that >90% of the genome is copied into RNA, it is interesting to note that research suggests the following:

No Such Thing As ‘Junk RNA,’ Say Researchers

ScienceDaily (Oct. 12, 2009) — [B]Tiny strands of RNA previously dismissed as cellular junk are actually very stable molecules that may play significant roles in cellular processes[/B], according to researchers at the University of Pittsburgh School of Medicine and the University of Pittsburgh Cancer Institute (UPCI).
The findings, published last week in the online version of the Journal of Virology, represent the first examination of very small RNA products termed unusually small RNAs (usRNAs). Further study of these usRNAs, which are present in the thousands but until now have been neglected, could lead to new types of biomarkers for diagnosis and prognosis, and new therapeutic targets.

In recent years, scientists have recognized the importance of small RNAs that generally contain more than 20 molecular units called nucleotides, said senior author Bino John, Ph.D., assistant professor, Department of Computational Biology, Pitt School of Medicine.

“But until we did our experiments, we didn’t realize that RNAs as small as 15 nucleotides, which we thought were simply cell waste, are surprisingly stable, and are repeatedly, reproducibly, and accurately produced across different tissue types.” Dr. John said. “We have dubbed these as usRNAs, and we have identified thousands of them, present in a diversity that far exceeds all other longer RNAs found in our study.”

The team’s experiments began with the observation that the Kaposi sarcoma-associated herpesvirus produces a usRNA that can control the production of a human protein. Detailed studies using both computational and experimental tools revealed a surprisingly large world of approximately 15 nucleotide-long usRNAs with intriguing characteristics. Many usRNAs interact with proteins already known to be involved in small RNA regulatory pathways. Some also share highly specific nucleotide patterns at one end. The researchers wrote that the existence of several different patterns in usRNAs reflects the diverse pathways in which the RNAs participate.

“These findings suggest that usRNAs are involved in biological processes, and we should investigate them further,” Dr. John noted. “They may be valuable tools to diagnose diseases, or perhaps they could present new drug targets.”

In addition to exploring biomarker potential, he and his colleagues plan to better characterize the various subclasses of usRNAs, identify their protein partners and study how they are made in the cell.

Co-authors of the paper include Zhihua Li, Ph.D., Sang Woo Kim, Ph.D., Yuefeng Lin, of the Department of Computational Biology; Patrick S. Moore, M.D., M.P.H, Department of Microbiology and Molecular Genetics and the Molecular Virology Program, UPCI; and Yuan Chang, M.D., Molecular Virology Program, UPCI.

This research was supported by grants from the National Institute of General Medical Sciences and the National Cancer Institute, the American Cancer Society, the Pennsylvania Department of Health and the University of Pittsburgh.

A new type of RNA molecule, usRNA. And it has a regulatory role to play.

The original thread was unfortunately moved to the flame war section, partly through my own fault. Sincere apologies for that. Let’s have a constructive conversation about junk DNA and RNA :).

Some honesty at last! When are you going to come clean about your agenda?

How times have changed.

1) Dr. Susumu Ohno coined the term junk DNA in 1972 in his article So Much ‘Junk DNA’ in our Genome.
2) In 1976, Dawkins published his selfish gene idea in his book, The Selfish Gene.
The idea: Genomic DNA can be accounted for in two ways:
A) Specific functions of sequences contribute to phenotypic fitness and were thus selected
B) DNA sequences that do not contribute to fitness are parasitic elements that replicate themselves without any evolutionary function. Selfish elements. These selfish elements were described as non-coding and non-specific sequences, including repetitive sequences, transposons and other degenerate elements.

3) In 1980, the following article appeared:
Selfish genes, the phenotype paradigm and genome evolution.

Natural selection operating within genomes will inevitably result in the appearance of DNAs with no phenotypic expression whose only 'function' is survival within genomes. Prokaryotic transposable elements and eukaryotic middle-repetitive sequences can be seen as such DNA's and [B]thus no phenotypic or evolutionary function need be assigned to them.[/B]
4) Dawkins in this article had the following to say about junk DNA:
Genomes are littered with [B]nonfunctional pseudogenes, faulty duplicates of functional genes that do nothing[/B], while their functional cousins (the word doesn’t even need scare quotes) get on with their business in a different part of the same genome. [B]And there’s lots more DNA that doesn’t even deserve the name pseudogene.[/B] It, too, is derived by duplication, but not duplication of functional genes. [B]It consists of multiple copies of junk, tandem repeats, and other nonsense which may be useful for forensic detectives but which doesn’t seem to be used in the body itself.[/B]
5) At TalkOrigins, there is this article from 2003:
Fascinating to read the thoughts on junk DNA. For example:
[B]5.4 Important roles have been found for DNA regions previously thought to be functionless[/B]

At a recent debate with me Dr. Gish cited a review in Science entitled Mining treasures from ‘junk’ DNA (263:608, 1994), seeming to imply that this review suggests functions for pseudogenes and retroposons that would be consistent with the creationist view that they were designed to function similarly in similar species. In fact, this review discusses evidence for possible functions of centromeric and telomeric repetitive sequences, minisatellites, introns and 3’ untranslated regions. It mentions pseudogenes and retroposons but makes no suggestion that these particular elements have function, so this review offers no argument against the points made in this essay. Nevertheless, since there have been other speculations about possible functions for DNA outside gene coding sequences, it is worth considering why scientists generally accept the notion that most of this DNA is junk.

First, we know several mechanisms by which DNA length can be increased through genetic accidents such as DNA duplications and insertion of retroposons, which have been observed in the lab or occurring in humans without apparent effects; so it is reasonable to suppose that these mechanisms operated in the past to increase genome size without affecting function. There appears to be little or no selective pressure to reduce the size of vertebrate nuclear genomes; and there is no apparent mechanism to selectively eliminate useless DNA. Large deletions that eliminate functional DNA are selected against. These observations would predict the accumulation of useless DNA as the result of random genetic accidents, so when we see DNA that seems non-functional, we shouldn’t necessarily assume that it has function that we don’t understand.

Second, when DNA sequence is compared between species like human versus mouse, sequences that are known to have function – coding sequences of genes in particular – are found to be highly similar, consistent with selective pressure that weeds out individuals that have deleterious mutations in these functional regions. Conversely, DNA regions with no known function – e.g. non-coding sequences between genes – generally behave as if they are under no selective pressure, that is they apparently accumulate mutations at a much higher rate so there is little sequence conservation between distantly related species. As an exception that probes the rule, comparisons of non-coding sequence across species occasionally detect islands of short conserved sequence in non-coding regions. Some of these have turned out to correspond to regulatory regions like promoter or enhancer elements that control when a nearby gene is expressed. An example of such an island conserved between rabbit, mouse and human was discovered in my own lab [Emorine et al., Nature 304:447, 1983]; it turned out to represent an important enhancer. These kinds of regulatory regions generally take up much less DNA than the coding sequences of the genes they regulate, so they cannot represent a likely function for most non-coding DNA. The good correlation between function and sequence conservation lends support to the idea that most poorly conserved sequences do not have function. However, it should be noted that for most of the islands of conserved sequence in DNA between genes (Shabalina et al., Trends Genet 17:373, 2001), no function has yet been discovered. Some may include RNA species that function without being translated into protein.

A third but related argument derives from the observation that the insertion of a retroposon into a functional sequence is a potent way to destroy that function. Examples of naturally occurring insertions were discussed in section 5.2 above; and intentional retroposon insertion is being widely used as a laboratory tool to create panels of mouse, drosophila or yeast strains with different gene functions destroyed. However, most examples of retroposon insertions between genes do not have any apparent effect on individuals harboring them; for example the Alu sequences that are polymorphic in human DNA appear to be harmless when present. Therefore, it is reasonable to infer that these insertions did not interrupt any functional sequence. (Of course it is impossible to rule out the formal possibility that some hypothetical functional sequences outside genes can still function despite the presence of a retroposon insertion.)

Finally, several examples are known of pairs of species that have similar apparent complexity but widely different genome size (C-value paradox). The pufferfish Fugu has about one fourth the genome size of other fish species but about the same number of genes. The main difference is a smaller amount of DNA between genes in Fugu DNA (e.g. see Elgar et al. Genome Res 9:960, 1999). Although questions remain about the interpretation of this difference, it would seem that much of the DNA between genes in most fish genomes (and probably in ours also) is dispensable. (Conversely, the small regions of non-coding sequence that are conserved between Fugu and Homo frequently correspond to functional regulatory sequences.)

It is impossible to prove absence of function for any region of DNA. Moreover, it is likely that some function may be found for a few additional short regions of non-coding DNA that are not currently recognized to have function. Nevertheless, as indicated above, scientists draw tentative conclusions based on data currently at hand rather than on hypothetical possibilities of future data; and the arguments I just presented based on presently available evidence suggest that most DNA sequences that appear to be functionless are just that.

"it is worth considering why scientists generally accept the notion that most of this DNA is junk,"
looks like there was a bit of a consensus…

"we shouldn’t necessarily assume that it has function that we don’t understand."
That turned out to be an unscientific thing to do…

"Nevertheless, as indicated above, scientists draw tentative conclusions based on data currently at hand rather than on hypothetical possibilities of future data; and the arguments I just presented based on presently available evidence suggest that most DNA sequences that appear to be functionless are just that."
Yes, appearances were a bit deceiving it seems.

Back in 2009:
1.5-2% of our genomes code for proteins, leaving 98-98.5% of the genome to be assigned as non-coding.

>90% of our genomes are in fact transcribed into RNA (non-coding).

And scientists discovered that:

No Such Thing As ‘Junk RNA,’ Say Researchers

The selfish gene/parasitic DNA idea is a bit stranded at the moment it seems.

More functions of junk DNA…
Junk DNA Mechanism That Prevents Two Species From Reproducing Discovered

ScienceDaily (Oct. 27, 2009) — Cornell researchers have discovered a genetic mechanism in fruit flies that prevents two closely related species from reproducing, a finding that offers clues to how species evolve.

When two populations of a species become geographically isolated from each other, their genes diverge from one another over time.

Eventually, when a male from one group mates with a female from the other group, the offspring will die or be born sterile, as crosses between horses and donkeys produce sterile mules. At this point, they have become two distinct species.

Now, Cornell researchers report in the October issue of Public Library of Science Biology (Vol. 7, No. 10) that rapidly evolving “junk” DNA may create incompatibilities between two related species, preventing them from reproducing. In this case, the researchers studied crosses between closely related fruit flies, Drosophila melanogaster and D. simulans. Nearly 100 years ago, scientists discovered that when male D. melanogasters mate with female D. simulans, normal males survive, but the female embryos die.

“It has remained an unsolved problem,” said Patrick Ferree, the paper’s lead author and a postdoctoral researcher in the lab of co-author Daniel Barbash, an assistant professor of molecular biology and genetics. “The question is, what are the elements that are killing these female hybrids and how are they doing that?”

The researchers found that the female hybrid embryos died very early in development. In most species, when the male’s sperm (carrying either an X or Y chromosome) fertilizes the female’s egg (containing an X chromosome), a new cell forms with a single nucleus containing a sex chromosome from each parent. If the offspring inherits its father’s X chromosome, it becomes female; if it inherits a Y chromosome, it becomes male. Ferree and Barbash found that a unique segment of DNA in the father’s X chromosome leads to embryo death of hybrid females.

The segment of DNA was found in the chromosome’s heterochromatin, a densely packed region of highly repetitive sequences of junk DNA near the chromosome’s center.

During the embryo’s initial divisions, the researchers found, a specific segment of heterochromatin gets “sticky” and halts the process, preventing the entire X chromosome from separating properly; the result is that the early embryo dies.

Researchers have known that DNA in heterochromatin evolves faster than in other parts of the genome. Also, during early development, the proteins required for cell division come from the mother. The researchers speculate that the heterochromatin of the male D. melanogaster’s X chromosome has rapidly evolved, such that after mating, the machinery involved in DNA packaging from a D. simulans mother no longer recognizes the D. melanogaster father’s “junk” DNA, Ferree said.

The problematic region of D. melanogaster’s X chromosome contains about 5 million base pairs of DNA, while the same region of D. simulans’ X chromosome contains only about 100,000 base pairs, a 50-fold difference, said Ferree.

“It points to a species-specific difference in heterochromatin between these two species,” he added. “This could explain other instances when you have female hybrid lethality,” Ferree said.

The study was funded by the National Institutes of Health.

Prevents/inhibits/buffers against reproduction of closely related species. Thanks to “junk DNA” and a few other mechanisms, you can’t have children with sheep (poor aussies :p) or chimps…
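
The 50-fold figure quoted in the article (about 5 million base pairs in D. melanogaster versus about 100,000 in D. simulans) is easy to check. A minimal Python sketch, with variable and function names of my own choosing and the rounded sizes from the press release as inputs:

```python
def fold_difference(larger_bp, smaller_bp):
    """Return how many times larger one region is than another (sizes in base pairs)."""
    if smaller_bp <= 0:
        raise ValueError("smaller_bp must be positive")
    return larger_bp / smaller_bp


# Approximate sizes of the heterochromatin region, as quoted in the article.
melanogaster_bp = 5_000_000
simulans_bp = 100_000

print(fold_difference(melanogaster_bp, simulans_bp))  # 50.0
```

Both inputs are rounded estimates, so the 50-fold result is approximate too.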

But there’s one clump of junk DNA that so far is in no need of rethinking…

Human genome at ten: Life is complicated

The more biologists look, the more complexity there seems to be. Erika Check Hayden asks if there's a way to make life simpler....
[b]... Web-like networks[/b]

Biologists have seen promises of simplicity before. The regulation of gene expression, for example, seemed more or less solved 50 years ago. In 1961, French biologists François Jacob and Jacques Monod proposed the idea that ‘regulator’ proteins bind to DNA to control the expression of genes. Five years later, American biochemist Walter Gilbert confirmed this model by discovering the lac repressor protein, which binds to DNA to control lactose metabolism in Escherichia coli bacteria. For the rest of the twentieth century, scientists expanded on the details of the model, but they were confident that they understood the basics. “The crux of regulation,” says the 1997 genetics textbook Genes VI (Oxford Univ. Press), “is that a regulator gene codes for a regulator protein that controls transcription by binding to particular site(s) on DNA.”

Just one decade of post-genome biology has exploded that view. Biology’s new glimpse at a universe of non-coding DNA — what used to be called ‘junk’ DNA — has been fascinating and befuddling. Researchers from an international collaborative project called the Encyclopedia of DNA Elements (ENCODE) showed that in a selected portion of the genome containing just a few per cent of protein-coding sequence, between 74% and 93% of DNA was transcribed into RNA. Much non-coding DNA has a regulatory role; small RNAs of different varieties seem to control gene expression at the level of both DNA and RNA transcripts in ways that are still only beginning to become clear. “Just the sheer existence of these exotic regulators suggests that our understanding about the most basic things — such as how a cell turns on and off — is incredibly naive,” says Joshua Plotkin, a mathematical biologist at the University of Pennsylvania in Philadelphia.

Even for a single molecule, vast swathes of messy complexity arise. The protein p53, for example, was first discovered in 1979, and despite initially being misjudged as a cancer promoter, it soon gained notoriety as a tumour suppressor — a ‘guardian of the genome’ that stifles cancer growth by condemning genetically damaged cells to death. Few proteins have been studied more than p53, and it even commands its own meetings. Yet the p53 story has turned out to be immensely more complex than it seemed at first.

In 1990, several labs found that p53 binds directly to DNA to control transcription, supporting the traditional Jacob–Monod model of gene regulation. But as researchers broadened their understanding of gene regulation, they found more facets to p53. Just last year, Japanese researchers reported that p53 helps to process several varieties of small RNA that keep cell growth in check, revealing a mechanism by which the protein exerts its tumour-suppressing power.

Even before that, it was clear that p53 sat at the centre of a dynamic network of protein, chemical and genetic interactions. Researchers now know that p53 binds to thousands of sites in DNA, and some of these sites are thousands of base pairs away from any genes. It influences cell growth, death and structure and DNA repair. It also binds to numerous other proteins, which can modify its activity, and these protein–protein interactions can be tuned by the addition of chemical modifiers, such as phosphates and methyl groups. Through a process known as alternative splicing, p53 can take nine different forms, each of which has its own activities and chemical modifiers. Biologists are now realizing that p53 is also involved in processes beyond cancer, such as fertility and very early embryonic development. In fact, it seems wilfully ignorant to try to understand p53 on its own. Instead, biologists have shifted to studying the p53 network, as depicted in cartoons containing boxes, circles and arrows meant to symbolize its maze of interactions…

Nice article, have fun.

The language of RNA decoded: Study reveals new function for pseudogenes and noncoding RNAs

The central dogma of molecular biology, as proposed in 1970 by Francis Crick and James Watson, holds that genetic information is transferred from DNA to functional proteins by way of messenger RNA (mRNA). This suggests that mRNA has but a single role, that being to encode for proteins.

Now, a cancer genetics team at Beth Israel Deaconess Medical Center (BIDMC) suggests there is much more to RNA than meets the eye…

A coding-independent function of gene and pseudogene mRNAs regulates tumour biology

The canonical role of messenger RNA (mRNA) is to deliver protein-coding information to sites of protein synthesis. However, given that microRNAs bind to RNAs, we hypothesized that RNAs could possess a regulatory role that relies on their ability to compete for microRNA binding, independently of their protein-coding function. As a model for the protein-coding-independent role of RNAs, we describe the functional relationship between the mRNAs produced by the PTEN tumour suppressor gene and its pseudogene PTENP1 and the critical consequences of this interaction. We find that PTENP1 is biologically active as it can regulate cellular levels of PTEN and exert a growth-suppressive role. We also show that the PTENP1 locus is selectively lost in human cancer. We extended our analysis to other cancer-related genes that possess pseudogenes, such as oncogenic KRAS. We also demonstrate that the transcripts of protein-coding genes such as PTEN are biologically active. [b]These findings attribute a novel biological role to expressed pseudogenes, as they can regulate coding gene expression, and reveal a non-coding function for mRNAs.[/b]

These results potentially have major implications for microarray data (and the interpretation thereof) that has so far been gathered :o .
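
To make the "compete for microRNA binding" idea from the abstract a bit more concrete, here is a toy Python sketch. It is not the paper's model: the function, the one-microRNA-per-transcript rule, the equal-affinity assumption and all the molecule counts are my own simplifications, chosen only to show how a decoy transcript (such as a pseudogene RNA sharing the same binding sites) could free up target mRNA.

```python
def free_target_fraction(mirna, target_mrna, decoy_rna=0.0):
    """Fraction of target mRNA left unbound by microRNA in a toy titration model.

    Assumptions (mine, for illustration only):
      * one microRNA molecule binds and silences one transcript;
      * target and decoy transcripts bind the microRNA equally well;
      * microRNA is shared among transcripts in proportion to their abundance.
    """
    if mirna < 0 or target_mrna < 0 or decoy_rna < 0:
        raise ValueError("molecule counts must be non-negative")
    total_transcripts = target_mrna + decoy_rna
    if total_transcripts == 0:
        raise ValueError("need at least one transcript")
    # Every transcript has the same chance of being bound, capped at 100%.
    bound_fraction = min(1.0, mirna / total_transcripts)
    return 1.0 - bound_fraction


# Hypothetical counts: 80 microRNAs, 100 target mRNAs, optionally 100 decoys.
without_decoy = free_target_fraction(mirna=80, target_mrna=100)
with_decoy = free_target_fraction(mirna=80, target_mrna=100, decoy_rna=100)
print("free target fraction without decoy:", round(without_decoy, 3))
print("free target fraction with decoy:   ", round(with_decoy, 3))
```

In this toy setup the decoy soaks up part of the microRNA pool, so more of the target mRNA stays unbound. Real binding kinetics are far messier, but the direction of the effect matches what the abstract describes for PTENP1 and PTEN.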

Mystery RNA spawns gene-activating peptides

Short peptides that regulate fruitfly development are produced from 'junk' RNA. Some so-called 'non-coding' pieces of RNA may actually encode short proteins that regulate genes, researchers have found....

Junk DNA vs “shadow enhancers”:
Redundant Genetic Instructions in ‘Junk DNA’ Support Healthy Development

ScienceDaily (July 17, 2010) — Seemingly redundant portions of the fruit fly genome may not be so redundant after all. New findings from a Princeton-led team of researchers suggest that repeated instructional regions in the flies' DNA may contribute to normal development under less-than-ideal growth conditions by making sure that genes are turned on and off at the appropriate times. If similar regions are found in humans, they may hold important clues to understanding developmental disorders.
[b]The research results, published in the July 22 issue of the journal Nature, add to the growing body of evidence that so-called "junk DNA" is anything but rubbish. The term "junk DNA" is commonly used to describe the portion of the genome that doesn't contain genes, which are pieces of DNA that code for the production of proteins and other molecules that have specific functions. The noncoding region is often surprisingly large; in humans, some 98 percent of the genome merits "junk" status. But according to David Stern, a Princeton professor in the Department of Ecology and Evolutionary Biology, [u]scientists increasingly believe "junk DNA" is crucial for turning the information encoded in genes into useful products.[/u][/b]

“Over the past 10 to 20 years, research has shown that instructional regions outside the protein-coding region are important for regulating when genes are turned on and off,” said Stern, the senior scientist on the paper. “Now we’re finding that additional copies of these genetic instructions are important for maintaining stable gene function even in a variable environment, so that genes produce the right output for organisms to develop normally.”

Stern, along with Nicolás Frankel, a postdoctoral research fellow at Princeton, and their collaborators focused their attention on instructional regions called enhancers. These regions play an important role in the process by which information encoded in genes is used to direct the synthesis of the proteins that make an organism what it is – be it a fly, a mouse or a human.

“To interpret and fully understand the genome, we need to think of it from an ecological and evolutionary perspective,” Stern said. “Its purpose is to produce a healthy organism in a variable environment, so a good portion of it has evolved to deal with contingencies that organisms will experience in the real world.”

When enhancers were first discovered, scientists believed that they always were located in close proximity to the target genes that they regulate. Distances in DNA are measured in base pairs, which are the building blocks that make up the DNA molecule (for comparison, the entire fruit fly genome contains about 130 million base pairs, while the human genome has more than 3 billion). Until recently, enhancers were thought always to exist within about 1,000 base pairs from their target genes.

But in 2008, the University of California-Berkeley’s Michael Levine reported the discovery of secondary enhancers for a particular fruit fly gene that were located much farther away from the target genes and from the previously discovered enhancers that were located adjacent to the gene.

Levine’s team called the apparently redundant copies in distant genetic realms “shadow enhancers” and hypothesized that they might serve to make sure that genes are expressed normally, even if development is disturbed. Factors that might induce developmental disturbances include environmental conditions, such as extreme temperatures, and internal factors, such as mutations in other genes…

Follow the evidence…

More fascinating news about non-coding RNA:
Human Cells Can Copy Not Only DNA, but Also RNA

ScienceDaily (Aug. 10, 2010) — Single-molecule sequencing technology has detected and quantified novel small RNAs in human cells that represent entirely new classes of the gene-translating molecules, confirming a long-held but unproven hypothesis that mammalian cells are capable of synthesizing RNA by copying RNA molecules directly. The findings were reported in Nature by researchers from the University of Pittsburgh School of Medicine, Helicos Biosciences Corp., Integromics Inc., and the University of Geneva Medical School.
"For the first time, we have evidence to support the hypothesis that human cells have the widespread ability to copy RNA as well as DNA," said co-author Bino John, Ph.D., assistant professor, Department of Computational and Systems Biology, Pitt School of Medicine. "These findings emphasize the complexity of human RNA populations and suggest the important role for single-molecule sequencing for accurate and comprehensive genetic profiling."

Scientists had thought that all RNA in human cells was copied from the DNA template, Dr. John explained. The presence of mechanisms that copy RNA into RNA, typically associated with an enzyme called RNA-dependent RNA polymerase, has only been documented in plants and simple organisms, such as yeast, and implicated in regulation of crucial cellular processes. Since thousands of such RNAs have been detected in human cells and because these RNAs have never before been studied, further research could open up new fronts in therapeutics, particularly diagnostics, Dr. John said.

In the study, the researchers profiled small RNAs from human cells and tissues, uncovering several new classes of RNAs, including antisense termini-associated short RNAs, which are likely derived from messenger RNAs of protein-coding genes by yet uncharacterized, pervasive RNA-copying mechanisms in human cancer cell lines.

“This class of non-coding RNA molecules has been historically overlooked because available sequencing platforms often are unable to provide accurate detection and quantification,” said Patrice Milos, Ph.D., chief scientific officer at Helicos Biosciences. “Our technology provides the platform capability to identify and quantify these RNAs and reinforces the potential clinical advantages of our single molecule-sequencing platform.”

Co-authors include A. Paula Monaghan, Ph.D., and Sang Woo Kim, Ph.D., University of Pittsburgh School of Medicine; Sylvain Foissac, Ph.D., Integromics Inc.; Stylianos Antonarakis, M.D., Ph.D., and Christelle Borel, Ph.D., University of Geneva Medical School; and Philipp Kapranov, Ph.D., and others from Helicos BioSciences.

The research was funded by the American Cancer Society, the National Institutes of Health, the Swiss National Science Foundation, Integromics Inc., and Helicos BioSciences Corporation.

Gosh, I hope this class of non-coding RNAs was not neglected because some companies thought it might be junk and thus not worthwhile to develop the technology for these sequencing platforms?

More non-coding RNA:
‘Linc-ing’ a noncoding RNA to a central cellular pathway

The recent discovery of more than a thousand genes known as large intergenic non-coding RNAs (or "lincRNAs") opened up a new approach to understanding the function and organization of the genome. That surprising breakthrough is now made even more compelling with the finding that dozens of these lincRNAs are induced by p53 (the most commonly mutated gene in cancer), suggesting that this class of genes plays a critical role in cell development and regulation.
Furthermore, the researchers identify one lincRNA in particular (lincRNA-p21), and demonstrate its critical role in suppressing the reading of many genes across the genome following p53 activation. Led by investigators at Beth Israel Deaconess Medical Center (BIDMC) and the Broad Institute, the results are published in the August 6 issue of the journal Cell, which appears on-line today.

“We think that lincRNA-p21 may represent a new class of ‘tumor suppressor lincRNAs,’” said senior author John Rinn, PhD, Assistant Professor of Pathology at BIDMC and Harvard Medical School, and an Associate Member of the Broad Institute. “These findings may lead to the identification of novel biomarkers and targets for anti-cancer therapies, as well as add to our understanding of the mechanisms of gene regulation by lincRNAs.”

Since the central role of the p53 gene in cancer was first described more than 30 years ago, literally thousands of scientific publications have been published describing various aspects of its “tumor suppressor” role in regulating cell cycle and cell death (apoptosis) in response to DNA damage, by turning various relevant response genes on or off. However, the intermediary partners and mechanisms by which it carries out its function are still little understood. This current work demonstrates that several dozen lincRNAs are targeted directly by p53, and lincRNA-p21 in particular responds to p53 signaling by suppressing multiple genes across the genome to drive apoptosis.

“We were surprised to find that lincRNA-p21 appears to be functioning as a global repressor, regulating hundreds of genes in the p53 pathway,” said Maite Huarte, PhD, first and co-corresponding author. “This lincRNA is playing defense for p53 to block other pathways in their efforts to interfere with p53’s critical job of tumor suppression by cell death.”

lincRNA-p21 carries out this function by roping in other critical factors in the cell nucleus to assist in tamping down expression at specific genes. “In the same way that air traffic controllers organize planes in the air, lincRNAs organize key nuclear complexes in the cell,” said Rinn. “lincRNA-p21 specifically binds to a protein called hnRNP-K and then guides hnRNP-K to its final destination to shut down any genes that interfere with p53.”

As exciting as these findings are for understanding multiple forms of cancer, they have far broader implications for understanding basic genome biology and multiple diseases. “We know that so-called ‘transcription factors’ can turn genes on by recruiting transcriptional machinery, but it has been less clear how they turn genes off,” says Rinn. “lincRNAs could be those elusive ‘anti-factors’ that serve to shut genes down by reshuffling proteins around the genome.”

Provided by Beth Israel Deaconess Medical Center

More about lincs:
Long Noncoding RNA as Modular Scaffold of Histone Modification Complexes

And scientists find more microRNAs that control growth:
Scientists find gas pedal – and brake – for uncontrolled cell growth

About 8-10% of your genome (if you are human :P) consists of viral elements, and of course they played an important role in our evolution:

Ancient Viral Invasion Shaped Human Genome

ScienceDaily (Sep. 13, 2010) — Scientists at the Genome Institute of Singapore (GIS), a biomedical research institute of the Agency for Science, Technology and Research (A*STAR), and their colleagues from the National University of Singapore, Nanyang Technological University, Duke-NUS Graduate Medical School and Princeton University have recently discovered that viruses that 'invaded' the human genome millions of years ago have changed the way genes get turned on and off in human embryonic stem (ES) cells.

[b]The study provides definitive proof of a theory that was first proposed in the 1950s by Nobel Laureate in physiology and medicine, Barbara McClintock, who hypothesized that transposable elements, mobile pieces of the genetic material (DNA), such as viral sequences, could be "control elements" that affect gene regulation once inserted in the genome.[/b]

This finding is an important contribution to the advancement of stem cell research and to its potential for regenerative medicine. Led by GIS Senior Group Leader Dr Guillaume Bourque, the study was published in Nature Genetics on June 6, 2010.

Through the use of new sequencing technologies, the scientists studied the genomic locations of three regulatory proteins (OCT4, NANOG and CTCF) in human and mouse embryonic stem (ES) cells. Interestingly, while the scientists found a lot of similarities, they also found many differences in the methods and the types of genes that are being regulated in humans. In particular, it was discovered that specific types of viruses that inserted themselves in the human genomes millions of years ago have dramatically changed the gene regulatory network in human stem cells.

“This study is a computational and experimental tour de force. It provides undeniable evidence that some transposable elements, which are too often dismissed as merely junk DNA, are key components of a regulatory code underlying human development,” said Dr Cedric Feschotte, Associate Professor of the University of Texas Arlington.

The comparisons between the human and mouse model system in the study of gene regulatory networks help to advance the understanding of how stem cells differentiate into various cell types of the body. “This understanding is crucial in the improved development of regenerative medicine for diseases such as Parkinson’s disease and leukaemia,” said Dr Bourque. “Despite the advantages of using mouse ES cells in the study of gene regulatory networks, further research must focus more directly on human stem cells. This is due to the inherent challenges of converting the results of studies done from one species to that of the next. More research will need to be done in both human and non-human primate stem cells for findings on stem cells to be used in clinical application.”

Prof Raymond L. White, PhD, Rudi Schmid Distinguished Professor of Neurology, University of California said, “The paper reports very exciting new findings that establish a new and fundamentally distinct mechanism for the regulation of gene expression. By comparing the genomes of mouse with human, the scientists were able to show that the binding sites for gene regulatory factors are very often not in the same place between the two species. This by itself would be very surprising, but the investigators go further and demonstrate that many of the sites are imbedded within a class of DNA sequences called “transposable” elements because of their ability to move to new places in the genome. There are a number of such elements believed to be the evolutionary remnants of viral genomes, but it was very surprising to learn that they were carrying binding sites for regulatory elements to new locations. These changes in regulation would be expected to create major changes in the organisms which carry them. Indeed, many think that regulatory changes are at the heart of speciation and may have played a large role in the evolution of humans from their predecessors. This is likely to be a landmark paper in the field.”

Dr Eddy Rubin, Director of the U.S. Department of Energy Joint Genome Institute and Director of the Genomics Division at Lawrence Berkeley National Laboratory in Berkeley added, “This study using a comparative genomics strategy discovered important human specific properties of the regulatory network in human ES cells. This information is significant and should contribute to helping move the regenerative medicine field forward.”

As a practicing biological scientist over the last 30 years, I long ago stopped agonising over what was junk DNA, how it worked, or even how many genes we have - because, eventually, we will discover the answers to all of these things.

Consider: just 25-odd years ago, introns (intervening sequences within genes that do not make it into messenger RNAs) were discovered, and immediately thought to be junk. Turns out they are often anything but, and that introns significantly affect gene expression, and may even jump around the genome.

Consider: just ten or so years ago, it was accepted that the human genome had ~100 000 genes: we now know we have only ~40 000 - on a par with fruit flies, and fewer than the average protozoan.

However, as the good Teleo points out, vastly more of the genome gets transcribed than is translated - and the zoo of RNA molecule types just keeps growing, as does their list of functions. That may have a lot to do with how we can be as complex as we are as organisms, with fewer genes than beasts that restrict themselves to one kind of cell.

It’s not really surprising that the dividing line between “junk DNA” and “non-junk DNA” depends on how “functional DNA” is defined. In this study, they looked at how much of the human genome has remained the same over the last 100 million years of mammalian evolution — i.e., how much of our genome is essential for mammalian survival and procreation. It’s only about 8.2%.