I am trying to solve the bioinformatics problems on rosalind.info, and I am stuck on this problem: http://rosalind.info/problems/mrna/
To solve it, I have to calculate the number of different RNA strings from which a protein could have been translated, modulo 1,000,000.
Biological background: a protein string is composed of 20 amino acids, represented by 20 different letters. Each amino acid can be encoded by one or more RNA strings (each composed of 3 letters).
This problem gets at how to manage large numbers when programming, which is a usual case in bioinformatics. I have tried different things, but I get INF or negative values.
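The INF result is probably what PHP's ordinary arithmetic produces once a running product grows too large. The following is a minimal sketch of that behaviour, assuming a hypothetical protein of 1000 amino acids that each have 6 codons; the loop and variable name are illustrative only and are not taken from the original attempt.

```php
<?php
// Multiply by 6 a thousand times with no reduction in between.
$r = 1;
for ($i = 0; $i < 1000; $i++) {
    $r *= 6;
}

// Once the product no longer fits in an integer, PHP converts it to a float.
var_dump(is_float($r));
// 6^1000 is far beyond the largest finite float, so the value ends up as INF.
var_dump(is_infinite($r));
```

Both calls report true, so any modulo applied only after the loop no longer has a meaningful number to work on.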
The problem suggests that I should find a way of manipulating large numbers without having to store them. How is this possible? How can I achieve it in PHP?
This is my best attempt so far:
<?php
function protein_reverse($sec) {
    $sec_arr = str_split($sec);
    // Number of codons for each amino acid
    $aa = array(
        'f' => '2', 'l' => '6', 's' => '6', 'y' => '2', 'c' => '2',
        'w' => '1', 'p' => '4', 'h' => '2', 'q' => '2', 'r' => '4',
        'i' => '3', 'm' => '1', 't' => '4', 'n' => '2', 'k' => '2',
        'v' => '4', 'a' => '4', 'd' => '2', 'e' => '2', 'g' => '4',
    );
    $r = 1;
    foreach ( $sec_arr as $base ) {
        $r *= $aa[$base] % 1000000;
    }
    return $r;
}
?>
I have been able to solve the problem. First, as @gavriel said in the comments on the question, I had to use the GMP library for these big-number operations. Second, I was missing a multiplication by 3 at the end. This is necessary because, once the protein is finished, there must be a termination codon (sequence), and there are three possible stop codons.
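Written out as a formula, the count described above looks roughly like this. The notation is introduced here for illustration and is not part of the original post: s_1 ... s_n are the amino acids of the protein, c(s_i) is the number of codons for amino acid s_i, and N is the answer.

```latex
\[
N \equiv 3 \cdot \prod_{i=1}^{n} c(s_i) \pmod{1\,000\,000}
\]
% Worked example for the two-letter protein MA, with c(M) = 1 and c(A) = 4:
\[
N = 3 \cdot 1 \cdot 4 = 12
\]
```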
/*
 * Reverse translation of a protein.
 * @return number of possible RNA strings modulo 1000000
 */
function protein_reverse($sec) {
    $sec_arr = str_split($sec);
    // Number of codons for each amino acid
    $aa = array(
        'f' => '2', 'l' => '6', 's' => '6', 'y' => '2', 'c' => '2',
        'w' => '1', 'p' => '4', 'h' => '2', 'q' => '2', 'r' => '6',
        'i' => '3', 'm' => '1', 't' => '4', 'n' => '2', 'k' => '2',
        'v' => '4', 'a' => '4', 'd' => '2', 'e' => '2', 'g' => '4',
    );
    $r = 1;
    foreach ( $sec_arr as $base ) {
        $r = gmp_mul($r, $aa[$base]);
    }
    // Three possible stop codons terminate the protein
    $r = gmp_mul($r, 3);
    $r = gmp_mod($r, 1000000);
    return $r;
}
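As an alternative to GMP, the hint about not storing the large number can be followed with plain integers, because (a * b) mod m equals ((a mod m) * (b mod m)) mod m. The sketch below reduces the running product after every multiplication, so it never exceeds a few million. The function name `count_rna_strings_mod`, the uppercase keys, and the case normalisation are assumptions made for this example, not part of the original solution.

```php
<?php
// Hypothetical helper: counts candidate RNA strings modulo $modulus
// using only native integers.
function count_rna_strings_mod(string $protein, int $modulus = 1000000): int
{
    // Number of codons per amino acid in the standard genetic code.
    $codons = [
        'F' => 2, 'L' => 6, 'S' => 6, 'Y' => 2, 'C' => 2, 'W' => 1,
        'P' => 4, 'H' => 2, 'Q' => 2, 'R' => 6, 'I' => 3, 'M' => 1,
        'T' => 4, 'N' => 2, 'K' => 2, 'V' => 4, 'A' => 4, 'D' => 2,
        'E' => 2, 'G' => 4,
    ];

    // Assumption: accept either letter case and ignore surrounding whitespace.
    $protein = strtoupper(trim($protein));

    // Start with the three possible stop codons.
    $count = 3 % $modulus;
    for ($i = 0, $n = strlen($protein); $i < $n; $i++) {
        $aa = $protein[$i];
        if (!isset($codons[$aa])) {
            throw new InvalidArgumentException("Unknown amino acid: $aa");
        }
        // Reduce immediately so the intermediate value stays small.
        $count = ($count * $codons[$aa]) % $modulus;
    }
    return $count;
}

echo count_rna_strings_mod('MA'), "\n"; // prints 12
```

With a modulus of 1,000,000 the largest intermediate value is below 6,000,000, which fits comfortably in a PHP integer, so no extension is needed for this particular problem.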