Human Genome

The human genome is the genome of Homo sapiens, which is stored on 23 chromosome pairs. Twenty-two of these are autosomal chromosome pairs, while the remaining pair is sex-determining. The haploid human genome occupies a total of just over 3 billion DNA base pairs and contains ca. 23,000 protein-coding genes, far fewer than had been expected before its sequencing. In fact, only about 1.5% of the genome codes for proteins, while the rest consists of non-coding RNA genes, regulatory sequences, introns, and (controversially named) "junk" DNA.

The Human Genome Project (HGP) produced a reference sequence of the euchromatic human genome, which is used worldwide in biomedical sciences.

Genes

Surprisingly, the number of human genes seems to be less than a factor of two greater than that of many much simpler organisms, such as the roundworm and the fruit fly. However, human cells make extensive use of alternative splicing to produce several different proteins from a single gene, and the human proteome is thought to be much larger than those of the aforementioned organisms. In addition, most human genes have multiple exons, and human introns are frequently much longer than the flanking exons.

Human genes are distributed unevenly across the chromosomes. Each chromosome contains various gene-rich and gene-poor regions, which seem to be correlated with chromosome bands and GC-content. The significance of these nonrandom patterns of gene density is not well understood.

In addition to protein-coding genes, the human genome contains thousands of RNA genes, including tRNA, ribosomal RNA, microRNA, and other non-coding RNA genes.

Regulatory Sequences

The human genome has many different regulatory sequences which are crucial to controlling gene expression. These are typically short sequences that appear near or within genes.
A systematic understanding of these regulatory sequences, and of how they act together as a gene regulatory network, is only beginning to emerge from computational, high-throughput expression and comparative genomics studies. Some types of non-coding DNA are genetic "switches" that do not encode proteins, but do regulate when and where genes are expressed.

Identification of regulatory sequences relies in part on evolutionary conservation. The human and mouse lineages, for example, diverged 70–90 million years ago, so non-coding sequences that computer comparisons show to be conserved between the two genomes are likely to be important for functions such as gene regulation.

Another comparative genomic approach to locating regulatory sequences in humans is sequencing the genome of the puffer fish. These vertebrates have essentially the same genes and regulatory sequences as humans, but with only one-eighth the "junk" DNA. The compact DNA sequence of the puffer fish makes it much easier to locate the regulatory sequences.
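The conservation-based comparison described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length, and it simply reports windows in which the fraction of identical bases meets a threshold. The function name `conserved_windows`, the window size, the threshold, and the two short sequences are all made up for illustration; real comparative genomics pipelines use proper alignment tools and statistical models of conservation.

```python
def conserved_windows(seq_a, seq_b, window=10, min_identity=0.9):
    """Return (start, identity) for every window of two pre-aligned,
    equal-length sequences whose fraction of identical bases is at
    least min_identity."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    hits = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Count positions where both sequences carry the same base.
        matches = sum(x == y for x, y in zip(a, b))
        identity = matches / window
        if identity >= min_identity:
            hits.append((start, identity))
    return hits


# Toy sequences invented for this example (not real genomic data).
human = "ACGTTGACCTAGGCTAACGT"
mouse = "ACGTTGACCTTCCGATACGT"

# Only perfectly conserved 10-base windows:
print(conserved_windows(human, mouse, window=10, min_identity=1.0))
# → [(0, 1.0)]
```

In this toy case the first ten bases are identical in both sequences, so that window is flagged as a candidate conserved element, while the divergent middle region is not.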

Other DNA

Protein-coding sequences (specifically, coding exons) comprise less than 1.5% of the human genome. Aside from genes and known regulatory sequences, the human genome contains vast regions of DNA the function of which, if any, remains unknown. These regions in fact comprise the vast majority, by some estimates 97%, of the human genome size. Much of this is composed of:

Repeat elements

Tandem repeat: Tandem repeats occur in DNA when a pattern of two or more nucleotides is repeated and the repetitions are directly adjacent to each other. An example would be: A-T-T-C-G-A-T-T-C-G-A-T-T-C-G, in which the sequence A-T-T-C-G is repeated three times.

Interspersed repetitive DNA is found in all eukaryotic genomes. Certain classes of these sequences propagate themselves by RNA-mediated transposition, and they have been called retrotransposons.

Transposons

The major difference of class II transposons from retrotransposons is that their transposition mechanism does not involve an RNA intermediate. Class II transposons usually move by a mechanism analogous to cut and paste, rather than copy and paste, using the transposase enzyme. Different types of transposase work in different ways. Some can bind to any part of the DNA molecule, and the target site can therefore be anywhere, while others bind to specific sequences. Transposase makes a staggered cut at the target site producing sticky ends, cuts out the transposon and ligates it into the target site.

Junk DNA

However, there is also a large amount of sequence that does not fall under any known classification. Much of this sequence may be an evolutionary artifact that serves no present-day purpose, and these regions are sometimes collectively referred to as "junk" DNA. There are, however, a variety of emerging indications that many sequences within are likely to function in ways that are not fully understood. Recent experiments using microarrays have revealed that a substantial fraction of non-genic DNA is in fact transcribed into RNA, which leads to the possibility that the resulting transcripts may have some unknown function. Also, the evolutionary conservation across the mammalian genomes of much more sequence than can be explained by protein-coding regions indicates that many, and perhaps most, functional elements in the genome remain unknown. The investigation of the vast quantity of sequence information in the human genome whose function remains unknown is currently a major avenue of scientific inquiry.

Author: Ms. Sujata Roy Saha, Research Scholar, Molecular Modeling & Drug Design
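The tandem repeat example given above (A-T-T-C-G repeated three times) can be checked with a short Python sketch. This is a naive scan written for illustration only, not a method described in the article. The function name `find_tandem_repeats` and its parameters are hypothetical, and the approach only detects exact, uninterrupted repeats, whereas dedicated tools also handle mismatches and gaps.

```python
def find_tandem_repeats(seq, min_unit=2, min_copies=2):
    """Return (start, unit, copies) for runs in which a unit of at least
    min_unit bases is repeated back-to-back at least min_copies times.
    At each position the unit covering the longest run is reported."""
    seq = seq.replace("-", "").upper()  # accept the A-T-T-C-G notation
    n = len(seq)
    results = []
    i = 0
    while i < n:
        best = None
        # A unit longer than (n - i) // min_copies cannot repeat enough times.
        for unit_len in range(min_unit, (n - i) // min_copies + 1):
            unit = seq[i:i + unit_len]
            copies = 1
            # Count directly adjacent exact copies of the unit.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                span = copies * unit_len
                if best is None or span > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best:
            results.append(best)
            i += len(best[1]) * best[2]  # skip past the reported run
        else:
            i += 1
    return results


print(find_tandem_repeats("A-T-T-C-G-A-T-T-C-G-A-T-T-C-G"))
# → [(0, 'ATTCG', 3)]
```

The result reports a start position of 0, the repeated unit ATTCG, and three adjacent copies, matching the example in the text.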