A seminar to the Bioinformatics Group &
Department of Computer Science,
University of Wales Aberystwyth,
e.g.
Characters (bases, amino acids) are not all equally informative in all contexts, hence model each character in its context (red):
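One way to picture "model in context" is an adaptive Markov model that charges each character its information content given the preceding characters. The sketch below is only an illustration under stated assumptions: the function name `context_costs`, the add-one smoothing and the DNA alphabet are choices made here, not details taken from the seminar.

```python
import math
from collections import defaultdict

def context_costs(seq, order=1, alphabet="ACGT"):
    """Cost in bits of each character of seq under an adaptive Markov model.

    Assumption for illustration: counts start at 1 for every character
    (add-one smoothing) and are updated after each character is coded.
    """
    counts = defaultdict(lambda: {c: 1 for c in alphabet})
    costs = []
    for i, ch in enumerate(seq):
        ctx = seq[max(0, i - order):i]      # up to `order` preceding characters
        table = counts[ctx]
        total = sum(table.values())
        costs.append(-math.log2(table[ch] / total))
        table[ch] += 1                      # learn from the character just seen
    return costs

# A repetitive sequence becomes cheap once its context has been seen.
low = context_costs("AAAAAAAA")
```

In a low-entropy run the later characters cost well under 2 bits each, so a chance match inside such a run is weaker evidence of relatedness than a match in a high-entropy region.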


r e a d o f f l i n e 
 
generic (global, 1-state) DPA -> (global, optimal) M-alignment

Generalizes to k-states, local & global, optimal & summed.
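For reference, a minimal sketch of a generic global, 1-state dynamic programming algorithm (DPA) with pluggable costs. It is not the seminar's implementation: `global_dpa`, `sub_cost`, `indel_cost` and `unit` are names invented here, and unit costs stand in for costs that, in a modelling approach, would presumably be derived from a sequence model.

```python
def global_dpa(a, b, sub_cost, indel_cost):
    """Minimum total cost of a global alignment of a with b (1-state DPA).

    sub_cost(x, y) is the cost of aligning characters x and y;
    indel_cost is a fixed cost per inserted or deleted character.
    """
    prev = [j * indel_cost for j in range(len(b) + 1)]   # row for empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * indel_cost]
        for j in range(1, len(b) + 1):
            curr.append(min(
                prev[j - 1] + sub_cost(a[i - 1], b[j - 1]),  # match or mismatch
                prev[j] + indel_cost,                         # delete a[i-1]
                curr[j - 1] + indel_cost,                     # insert b[j-1]
            ))
        prev = curr
    return prev[-1]

def unit(x, y):
    """Illustrative unit substitution cost: 0 for a match, 1 otherwise."""
    return 0 if x == y else 1

cost = global_dpa("ACGT", "AGT", unit, 1)   # one deletion
```

Replacing `min` by a sum over probabilities would give a "summed" rather than "optimal" variant; that change is not shown here.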
ROC curves are often used in this area.
Want high coverage (few false -ves) and
few errors (few false +ves).
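The coverage versus errors trade-off can be sketched as follows. This is an illustrative helper only: `roc_points` and its inputs are hypothetical, higher scores are assumed to mean "more likely related", tied scores are not treated specially, and at least one truly related pair is assumed.

```python
def roc_points(scores, related):
    """Return (false_positives, coverage) after accepting each pair in turn,
    best score first. related[i] is True if pair i is truly related."""
    n_pos = sum(related)                     # assumed to be at least 1
    ranked = sorted(zip(scores, related), key=lambda p: -p[0])
    points, tp, fp = [], 0, 0
    for _, is_related in ranked:
        if is_related:
            tp += 1                          # one fewer false negative
        else:
            fp += 1                          # one more false positive
        points.append((fp, tp / n_pos))
    return points

pts = roc_points([0.9, 0.8, 0.7, 0.6], [True, True, False, True])
```

A better method reaches high coverage while the false-positive count is still low, so its curve rises sooner.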
Easiest problem, new method ~ shuffling.
 
green: PRSS p-value
(blue: SW raw score); red: (summed) M-align (Markov=1).
Uniform random pop'n (2 bits/base): All methods should do well, & they do.
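The "2 bits/base" figure is the entropy of a uniform distribution over four bases. A small sketch, assuming a 0-order (composition only) view of a sequence; `entropy_bits` is a name chosen here for illustration.

```python
import math
from collections import Counter

def entropy_bits(seq):
    """Empirical 0-order entropy of seq in bits per character."""
    n = len(seq)
    counts = Counter(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

uniform = entropy_bits("ACGT" * 10)      # equal base frequencies
biased = entropy_bits("AAAAAAAT")        # strongly biased composition
```

Equal base frequencies give exactly 2 bits per base, while a biased composition gives less, which is the sense in which such sequences are "low entropy".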
Easy problem, new method ~ shuffling
 
green: PRSS p-value
(blue: SW raw score);
0-order data, biased composition: PRSS good, M-align best.
Harder problem, new M-alignment method (red) best

green: PRSS p-value
(blue: SW raw score); red: (summed) M-align (Markov=1).
Mixed pop'n: high-entropy 0-order seq's and low-entropy 0-order seq's.
Hard problem, new M-alignment method (red & purple) best

green: PRSS p-value;
(blue: SW raw score); red: (summed) M-align (Markov=1); purple: M-align (blended seq' model).
Pop'n of mixed seq's of high-entropy (2 bits/base) & low-entropy 1st-order regions.
Reading:
[*] Like many things out of Computer Science, Monash, this work owes a debt to