Bioinformatics involves the manipulation, searching, and data mining of biological data, including DNA sequence data. The development of techniques to store and search DNA sequences has led to widely applied advances in computer science, especially in string searching algorithms, machine learning, and database theory. String searching or matching algorithms, which find an occurrence of a sequence of characters inside a larger sequence of characters, were developed to search for specific sequences of nucleotides. A DNA sequence can be aligned with other DNA sequences to identify homologous sequences and to locate the specific mutations that make them distinct. Such techniques, multiple sequence alignment in particular, are used to study phylogenetic relationships and protein function. Data sets representing entire genomes' worth of DNA sequences, such as those produced by the Human Genome Project, are difficult to use without annotations that identify the locations of genes and regulatory elements on each chromosome. Regions of DNA sequence that have the characteristic patterns associated with protein-coding or RNA-coding genes can be identified by gene finding algorithms, which allow researchers to predict the presence of particular gene products and their possible functions in an organism even before they have been isolated experimentally. Entire genomes can also be compared, which can shed light on the evolutionary history of a particular organism and permit the examination of complex evolutionary events.
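As an illustration of string searching applied to nucleotides, the following is a minimal sketch of the Knuth-Morris-Pratt algorithm, one of several classic exact-matching methods. The function names and the example sequence are hypothetical and chosen only for illustration; production tools typically rely on indexed or heuristic search rather than a linear scan like this one.

```python
def build_failure_table(pattern: str) -> list[int]:
    """For each prefix of the pattern, store the length of its longest
    proper prefix that is also a suffix (the KMP failure function)."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def find_occurrences(text: str, pattern: str) -> list[int]:
    """Return the 0-based start of every match, including overlapping ones."""
    if not pattern:
        return []
    table = build_failure_table(pattern)
    positions = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            positions.append(i - k + 1)
            k = table[k - 1]  # keep going so overlapping matches are found
    return positions


if __name__ == "__main__":
    # Hypothetical example sequence, not taken from any real genome.
    print(find_occurrences("ATGCGATATATGC", "ATA"))
```

The failure table lets the scan continue after a mismatch without re-reading earlier characters, so the search runs in time proportional to the combined length of the sequence and the pattern.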


Bioinformatics is an interdisciplinary field that develops and improves methods for storing, retrieving, organizing, and analyzing biological data. A major activity in bioinformatics is the development of software tools to generate useful biological knowledge. Bioinformatics has become an important part of many areas of biology. In experimental molecular biology, bioinformatics techniques such as image and signal processing allow useful results to be extracted from large amounts of raw data. In the field of genetics and genomics, it aids in sequencing and annotating genomes and their observed mutations. It plays a role in the text mining of biological literature and in the development of biological and gene ontologies to organize and query biological data. It plays a role in the analysis of gene and protein expression and regulation. Bioinformatics tools aid in the comparison of genetic and genomic data and, more generally, in the understanding of evolutionary aspects of molecular biology. At a more integrative level, it helps to analyze and catalogue the biological pathways and networks that are an important part of systems biology. In structural biology, it aids in the simulation and modeling of DNA, RNA, and protein structures as well as molecular interactions.

Bioinformatics uses many areas of computer science, mathematics, and engineering to process biological data. Complex machines are used to read biological data at a much faster rate than before. Databases and information systems are used to store and organize biological data. Analyzing biological data may involve algorithms from artificial intelligence, soft computing, data mining, image processing, and simulation. These algorithms in turn depend on theoretical foundations such as discrete mathematics, control theory, systems theory, information theory, and statistics. Commonly used software tools and technologies include Java, C#, XML, Perl, C, C++, Python, R, SQL, CUDA, MATLAB, and spreadsheet applications.


Building on the recognition of the importance of information transmission, accumulation, and processing in biological systems, Paulien Hogeweg coined the term “bioinformatics” in 1970 to refer to the study of information processes in biotic systems. [4] [5] This definition placed bioinformatics as a field parallel to biophysics or biochemistry (biochemistry is the study of chemical processes in biological systems). [6] Examples of relevant biological information processes studied in the early days of bioinformatics are the formation of complex social interaction structures from simple behavioral rules, and the accumulation and maintenance of information in models of prebiotic evolution.

One early contributor to bioinformatics was Elvin A. Kabat, who pioneered biological sequence analysis in 1970 with his comprehensive volumes of antibody sequences, released with Tai Te Wu between 1980 and 1991. Another important pioneer was Margaret Oakley Dayhoff, whom David Lipman, director of the National Center for Biotechnology Information, has called the “mother and father of bioinformatics”. At the beginning of the “genomic revolution”, the term bioinformatics was rediscovered to refer to the creation and maintenance of databases that store biological information such as nucleotide and amino acid sequences. Developing this type of database involved not only design issues but also the development of complex interfaces through which researchers could access existing data and submit new or revised data.

Since the phage Φ-X174 was sequenced in 1977, the DNA sequences of thousands of organisms have been decoded and stored in databases. This sequence information is analyzed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs, and repetitive sequences. A comparison of genes within a species or between different species can show similarities between protein functions, or relationships between species (the use of molecular systematics to construct phylogenetic trees). As the volume of data grew, it became impractical to analyze DNA sequences manually. Today, computer programs such as BLAST are routinely used to search sequences from more than 260 000 organisms, containing more than 19 billion nucleotides.
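Comparing two sequences for similarity can be sketched with a global alignment score computed by dynamic programming, in the style of the Needleman-Wunsch algorithm. This is not how BLAST works internally (BLAST uses a faster heuristic search), and the scoring values below (+1 for a match, -1 for a mismatch, -1 per gap) are assumptions chosen for illustration rather than values taken from any particular tool.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the best global alignment score of sequences a and b.

    Only the previous row of the dynamic-programming table is kept,
    so memory use is proportional to len(b).
    """
    # Aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or substitution).
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap (deletion relative to a).
            up = prev[j] + gap
            # Align b[j-1] with a gap (insertion relative to a).
            left = curr[j - 1] + gap
            curr[j] = max(diagonal, up, left)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    # Hypothetical sequences: the second has one base deleted.
    print(global_alignment_score("GATTACA", "GATACA"))
```

A higher score indicates more similar sequences under the chosen scoring scheme; substitutions, insertions, and deletions each lower the score instead of preventing a comparison altogether.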


Sequence search programs such as BLAST can compensate for mutations (exchanged, deleted, or inserted bases) in the DNA sequence in order to identify sequences that are related but not identical. A variant of this sequence alignment is used in the sequencing process itself. The so-called shotgun sequencing technique (used, for example, by The Institute for Genomic Research to sequence the first bacterial genome, that of Haemophilus influenzae) does not produce entire chromosomes. Instead, it generates the sequences of many thousands of small DNA fragments (ranging from 35 to 900 nucleotides long, depending on the sequencing technology). The ends of these fragments overlap and, when aligned properly by a genome assembly program, can be used to reconstruct the complete genome. Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes. For a genome as large as the human genome, assembling the fragments may take many days of CPU time on large-memory, multiprocessor computers, and the resulting assembly usually contains numerous gaps that must be filled in later. Shotgun sequencing is the method of choice for virtually all genomes sequenced today, and genome assembly algorithms are an important area of bioinformatics research.
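The idea of reconstructing a sequence from overlapping fragment ends can be sketched with a greedy merge: repeatedly join the two fragments with the longest suffix-to-prefix overlap. This is a toy illustration under strong assumptions (error-free reads, a single strand, no repeated regions); real assemblers use more elaborate graph-based methods and must handle sequencing errors, repeats, and reverse complements. The function names, the `min_overlap` parameter, and the example fragments are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int = 1) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`,
    or 0 if no overlap of at least `min_overlap` characters exists."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Merge fragments greedily by longest overlap; return remaining contigs."""
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps any more: the gaps stay open
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Hypothetical fragments covering one short made-up sequence.
    reads = ["AGGCTA", "ACGTTGCA", "TGCAAGGC"]
    print(greedy_assemble(reads))
```

When no remaining pair overlaps by at least `min_overlap` characters, the function returns several separate contigs, which loosely mirrors the gaps that real assemblies leave to be filled in later. The pairwise comparison in each round is also a hint at why assembly becomes expensive as the number of fragments grows.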