Write a program in Python that determines the Hamming distance of two DNA strands

Write a program in Python that determines the Hamming distance of two DNA strands. Your program should do the following:

1. Prompt the user for 2 DNA strands.
2. Making use of problem 4, check if the strands are valid. If the inputted strands are of different lengths, this is considered invalid input.

The Correct Answer and Explanation:

Here’s a Python program that calculates the Hamming distance between two DNA strands. The Hamming distance is the number of positions at which the corresponding nucleotides are different.


Python Program:

def is_valid_dna(dna):
    """Checks if the DNA strand is valid (contains only A, T, C, G)."""
    return all(base in "ATCG" for base in dna)

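def hamming_distance(dna1, dna2):
    """Calculates the Hamming distance between two DNA strands."""
    if len(dna1) != len(dna2):
        # Raise instead of returning a message, so an error is never mistaken for a distance
        raise ValueError("DNA strands must be of equal length.")

    # Each mismatch evaluates to True, which sum() counts as 1
    return sum(base1 != base2 for base1, base2 in zip(dna1, dna2))

# Prompting the user for DNA strands (strip stray whitespace, normalize to uppercase)
dna1 = input("Enter the first DNA strand: ").strip().upper()
dna2 = input("Enter the second DNA strand: ").strip().upper()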

# Validating input
if not is_valid_dna(dna1) or not is_valid_dna(dna2):
    print("Error: Invalid DNA sequence. Only A, T, C, and G are allowed.")
elif len(dna1) != len(dna2):
    print("Error: DNA strands must be of equal length.")
else:
    print(f"The Hamming distance is: {hamming_distance(dna1, dna2)}")

Explanation

The Hamming distance is a metric that measures the number of differing positions between two strings of equal length. In DNA sequencing, this is useful to determine mutations or variations between genetic sequences.
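As a small worked illustration of this definition, the sketch below lists the positions where two equal-length strands differ. The strands and the helper name mismatch_positions are made up for this example and are not part of the program above; the Hamming distance is simply the number of positions returned.

```python
def mismatch_positions(strand_a, strand_b):
    """Return the 0-based indices where two equal-length strands differ.

    Hypothetical helper, for illustration only.
    """
    return [i for i, (a, b) in enumerate(zip(strand_a, strand_b)) if a != b]


positions = mismatch_positions("GATTACA", "GACTATA")
print(positions)       # [2, 5]
print(len(positions))  # 2, the Hamming distance
```

Here the strands differ at index 2 (T versus C) and index 5 (C versus T), so the distance is 2.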

1. Input Validation

The program first prompts the user to input two DNA strands and converts them to uppercase, so lowercase input such as "atcg" is accepted. Since a valid DNA sequence consists of only the characters A, T, C, and G, the is_valid_dna() function checks every character of each strand against that set.
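When a strand is rejected, it can help to tell the user which characters caused the problem. The sketch below is one possible way to do that; invalid_bases is a hypothetical helper, not part of the program above, and it assumes the same uppercase normalization the program applies.

```python
def invalid_bases(dna):
    """Return a sorted list of characters in dna that are not A, T, C, or G.

    Hypothetical helper, for illustration only. An empty list means
    every character is a valid nucleotide.
    """
    return sorted(set(dna.upper()) - set("ATCG"))


print(invalid_bases("ATCG"))   # []
print(invalid_bases("ATXGU"))  # ['U', 'X']
```

An empty result corresponds to is_valid_dna() returning True for the same strand.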

2. Length Check

DNA strands must be of equal length for a meaningful Hamming distance calculation. If the lengths are different, the program outputs an error message.
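The length check matters in practice because zip() stops at the end of the shorter input without raising an error. The sketch below shows what would happen if the check were skipped; unchecked_distance is a deliberately incomplete, hypothetical function used only to demonstrate the problem.

```python
def unchecked_distance(dna1, dna2):
    """Count mismatches WITHOUT a length check (deliberately incomplete)."""
    return sum(b1 != b2 for b1, b2 in zip(dna1, dna2))


# zip() pairs up only the first two characters; the trailing "GT" is never compared.
print(list(zip("ACGT", "AC")))           # [('A', 'A'), ('C', 'C')]
print(unchecked_distance("ACGT", "AC"))  # 0, a misleading result
```

Reporting a distance of 0 for strands of different lengths would be wrong, which is why the lengths are compared before any counting takes place.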

3. Calculating Hamming Distance

The function hamming_distance(dna1, dna2) compares the nucleotides of both strands position by position. It uses Python’s zip() function to iterate over the strands simultaneously and counts mismatches using a generator expression. Each comparison base1 != base2 evaluates to True or False, and sum() counts every True as 1.
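For readers less familiar with generator expressions, the same counting can be written as an explicit loop. This is an equivalent sketch, not a replacement for the function above; the name hamming_distance_loop is hypothetical, and the code assumes the caller has already confirmed that both strands have the same length.

```python
def hamming_distance_loop(dna1, dna2):
    """Explicit-loop version of the mismatch count.

    Hypothetical helper; assumes len(dna1) == len(dna2) was checked by the caller.
    """
    distance = 0
    for base1, base2 in zip(dna1, dna2):
        if base1 != base2:
            distance += 1
    return distance


# Booleans behave as the integers 1 and 0 under addition,
# which is what lets sum() count mismatches directly.
print(True + True)  # 2
print(hamming_distance_loop("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT"))  # 7
```

Both versions visit each pair of nucleotides exactly once, so they return the same result in the same number of steps.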

4. Output the Result

If the inputs are valid, the program computes and displays the Hamming distance.

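The program runs in O(n) time for strands of length n: validation makes one pass over each strand, and the comparison makes one more pass over both. Invalid characters and mismatched lengths are caught before any distance is computed, so the user sees a clear error message instead of a misleading result.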

