Statistical Bioinformatics

Homework 1

Question 1 (10pts)
a) What is the probability of observing the sequence 61325 when rolling a fair die?

Probabilities for fair dice: P(1)=P(2)=P(3)=P(4)=P(5)=P(6)=1/6
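
As a rough sketch of the arithmetic for part a), assuming the five rolls are independent and that 61325 means the ordered sequence 6, 1, 3, 2, 5, the probability is the product of the per-roll probabilities. The names `fair_die` and `sequence_probability` are illustrative and not part of the assignment.

```python
from fractions import Fraction

# Fair die: every face has probability 1/6.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}


def sequence_probability(rolls, die):
    """Probability of an ordered sequence of independent rolls."""
    prob = Fraction(1)
    for face in rolls:
        prob *= die[face]
    return prob


# Assumption: "61325" is read as the ordered rolls 6, 1, 3, 2, 5.
rolls = [6, 1, 3, 2, 5]
p_fair = sequence_probability(rolls, fair_die)
print(p_fair, float(p_fair))  # (1/6)**5 as an exact fraction and as a float
```

Using `Fraction` keeps the result exact, so it can be compared directly with a hand calculation of (1/6)^5.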

b) What is the probability of observing the sequence 61325 when rolling a loaded die?

Probabilities for loaded dice: P(1)=P(2)=P(3)=P(4)=P(5)=0.1 and P(6)=0.5
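
The same product rule can be sketched for part b), using P(6)=0.5 and 0.1 for every other face. This again assumes independent rolls and reads 61325 as the ordered sequence 6, 1, 3, 2, 5. The names `loaded_die` and `sequence_probability` are illustrative.

```python
import math

# Loaded die: a six comes up half the time, every other face 10% of the time.
loaded_die = {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5}


def sequence_probability(rolls, die):
    """Probability of an ordered sequence of independent rolls."""
    return math.prod(die[face] for face in rolls)


# Assumption: "61325" is read as the ordered rolls 6, 1, 3, 2, 5.
rolls = [6, 1, 3, 2, 5]
p_loaded = sequence_probability(rolls, loaded_die)
print(p_loaded)  # 0.5 * 0.1**4, up to floating-point rounding
```

Only one of the five rolls is a six, so the loaded die contributes one factor of 0.5 and four factors of 0.1.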

Question 2 (50pts). On a hypothetical island, a virus outbreak threatens to become a future pandemic. Researchers have narrowed the cause of the outbreak down to two viruses (virus 1 and virus 2). The DNA sequencing lab receives a sample for further analysis. Unfortunately, the sample was contaminated, and removing the foreign DNA leaves the lab with a short DNA fragment: AGTAGCTTCCAG. Given all available information (provided below), how can the lab determine which virus caused the outbreak?

Nucleotide probabilities of virus 1
P(A)=P(T)=0.3
P(G)=P(C)=0.2
Nucleotide probabilities of virus 2
P(A)=P(T)=P(G)=P(C)=0.25


– Virus 1 and Virus 2 are equally likely to occur in nature.
– Nucleotides are independent and identically distributed.
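
One way to approach this, sketched here under the stated assumptions (i.i.d. nucleotides, equal priors), is to compute the likelihood of the fragment under each virus model and combine it with the prior using Bayes' rule. The variable and function names below are illustrative and not part of the assignment.

```python
import math
from collections import Counter

fragment = "AGTAGCTTCCAG"

# Nucleotide models and priors as given in the question.
virus1 = {"A": 0.3, "T": 0.3, "G": 0.2, "C": 0.2}
virus2 = {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25}
prior = {"virus 1": 0.5, "virus 2": 0.5}


def log_likelihood(seq, probs):
    """Log-probability of seq under an i.i.d. nucleotide model."""
    counts = Counter(seq)
    return sum(n * math.log(probs[base]) for base, n in counts.items())


loglik = {
    "virus 1": log_likelihood(fragment, virus1),
    "virus 2": log_likelihood(fragment, virus2),
}

# Bayes' rule: posterior is proportional to prior * likelihood.
joint = {name: prior[name] * math.exp(loglik[name]) for name in prior}
evidence = sum(joint.values())
posterior = {name: joint[name] / evidence for name in joint}

for name in posterior:
    print(name, math.exp(loglik[name]), posterior[name])
```

The fragment contains three of each nucleotide, so the likelihoods are 0.3^6 · 0.2^6 (about 4.67e-8) for virus 1 and 0.25^12 (about 5.96e-8) for virus 2. With equal priors, the posterior simply follows the likelihood ratio and favors virus 2 at roughly 0.56 to 0.44. That is a modest preference rather than a decisive one for a fragment this short.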

Question 3 (40 pts)
Align the two sequences shown below using the Needleman-Wunsch algorithm.
Use a match score of 4, a mismatch score of -4, and a gap penalty of -2.

Provide the following:
a) the dynamic programming matrix with scores (as shown in Figure 6.1 of Ewens or Figure 2.5 of Durbin, both available under Course Content)
b) the traceback pointers
c) the alignment score
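
The three items above can be sketched in code as follows. This is a minimal Needleman-Wunsch implementation assuming a linear gap penalty, and it stores only one traceback pointer per cell (ties are broken in the order diagonal, up, left). The function name `needleman_wunsch` and the example sequences are hypothetical placeholders, not the sequences of the assignment.

```python
def needleman_wunsch(seq1, seq2, match=4, mismatch=-4, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (F, P, aligned1, aligned2, score), where F is the score
    matrix and P holds one traceback pointer per cell.
    """
    n, m = len(seq1), len(seq2)
    # F[i][j]: best score for aligning seq1[:i] with seq2[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    # P[i][j]: "D" (diagonal), "U" (up, gap in seq2), "L" (left, gap in seq1).
    P = [[""] * (m + 1) for _ in range(n + 1)]

    # First column and first row: prefixes aligned against gaps only.
    for i in range(1, n + 1):
        F[i][0], P[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        F[0][j], P[0][j] = j * gap, "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            candidates = [
                (F[i - 1][j - 1] + s, "D"),
                (F[i - 1][j] + gap, "U"),
                (F[i][j - 1] + gap, "L"),
            ]
            # max() keeps the first maximum, so ties prefer D, then U, then L.
            F[i][j], P[i][j] = max(candidates, key=lambda c: c[0])

    # Follow the pointers from the bottom-right cell back to the origin.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = P[i][j]
        if move == "D":
            a1.append(seq1[i - 1])
            a2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif move == "U":
            a1.append(seq1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(seq2[j - 1])
            j -= 1
    return F, P, "".join(reversed(a1)), "".join(reversed(a2)), F[n][m]


# Hypothetical example sequences, chosen only to exercise the function.
F, P, aligned1, aligned2, score = needleman_wunsch("GATTACA", "GCATGCT")
for row in F:
    print(row)
print(aligned1)
print(aligned2)
print("score:", score)
```

With these scores, a mismatch (-4) costs exactly as much as two gaps (-2 each), so many cells have ties and several optimal alignments may exist. A hand-drawn matrix would normally show every tied pointer, whereas this sketch keeps only one.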

sequence 1:


sequence 2:
