

This journal is a member of DOI and Crosscheck


Abstract


CLASSIFICATION OF SIGNALING PROTEINS USING COMPUTER-BASED METHODS
Cells exchange signals to start or stop biological activities. These signals can be molecules such as proteins and hormones. Receptors on the surface of the cell and inside it capture these molecules and take part in signal transduction. There are three types of cell-surface receptors: G-protein-linked receptors, enzymic receptors and chemically gated ion channels. The function of a protein depends on its structure. Signaling proteins play a crucial role in many biological activities, such as the functioning of the brain and the tongue, and they are important for drug discovery. For these reasons it is important to determine whether a protein is a signaling protein or not. Since experimental studies are expensive and time-consuming, machine learning techniques are used for this purpose. Besides the choice of machine learning technique, the protein encoding scheme also matters for obtaining good results. Proteins are made of amino acids, and the amino acids of a protein written one after another form its amino acid sequence. Protein encoding schemes turn these sequences into numerical inputs for machine learning. Amino acid composition is one such scheme: it encodes the frequency of each amino acid as a percentage, but the sequence-order information is lost in the process. To address this, Kuo-Chen Chou proposed pseudo amino acid composition in 2001. The aim of this study is to classify signaling proteins encoded by pseudo amino acid composition. Protein sequences in FASTA format were downloaded and encoded using pseudo amino acid composition to create a dataset. The dataset contained 1867 signaling and 3317 non-signaling proteins and was split into training and test sets of 75% and 25%, respectively. Random forest, support vector machine and deep neural network models were applied to classify the signaling proteins and gave accuracy values of 0.749, 0.748 and 0.763, respectively. The AUROC values were similar to each other, with the random forest algorithm reaching the highest value of 0.745. The accuracy of all three models is higher than 0.70, which shows that these models are effective in the classification of signaling proteins encoded by pseudo amino acid composition.
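The difference between the two encoding schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the pipeline used in the study. The function names, the parameters `lam` and `weight`, and the use of a single property (the Kyte-Doolittle hydropathy scale) are assumptions made for illustration. Chou's 2001 formulation averages several physicochemical properties, and the abstract does not state which properties or parameter values were used.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale. Assumption: used here as a single
# stand-in property; the study's actual property set is not given.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(seq):
    """Return the 20 amino acid frequencies as fractions summing to 1."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)  # non-standard letters are ignored
    return [c / total for c in counts]


def _normalized_property():
    """Scale the property to zero mean and unit deviation over the 20 residues."""
    values = [HYDROPATHY[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (HYDROPATHY[a] - mu) / sigma for a in AMINO_ACIDS}


def pseudo_amino_acid_composition(seq, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional vector: composition plus order terms."""
    h = _normalized_property()
    residues = [r for r in seq if r in h]
    n = len(residues)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [residues.count(a) / n for a in AMINO_ACIDS]
    # theta_k: mean squared property difference between residues k apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(h[residues[i + k]] - h[residues[i]]) ** 2 for i in range(n - k)]
        thetas.append(sum(diffs) / (n - k))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    # Same residues in a different order: identical composition,
    # different pseudo amino acid composition.
    for s in ("AVAVAV", "AAAVVV"):
        print(s, amino_acid_composition(s)[:2], pseudo_amino_acid_composition(s)[-3:])
```

Multiplying the composition values by 100 gives the percentages mentioned above. The last `lam` components are what carry the sequence-order information that plain amino acid composition discards.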

Keywords
PseAAC, support vector machine, neural network, random forest, signaling proteins




Address: Göztepe Mah., Beykoz, İstanbul/TURKEY
Telephone: +90 555 005 92 85 Fax: +90 216 606 32 75
Email: info@euroasiajournal.org
