A k-mer is a contiguous sequence of k nucleotides (the building blocks of DNA) in a genome. Biologists often use k-mers to identify patterns or motifs in genomic sequences, such as repeated sequences or conserved regions. Let’s build an algorithm to do this.
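As a rough illustration of the idea, here is a minimal Python sketch that counts every k-mer in a sequence with a sliding window. The function names `count_kmers` and `most_frequent_kmers` are hypothetical and chosen only for this example; this is not necessarily the algorithm the series builds.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in the sequence (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    # Slide a window of width k across the sequence, one position at a time.
    # If the sequence is shorter than k, the range is empty and so is the result.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def most_frequent_kmers(sequence: str, k: int) -> list:
    """Return all k-mers that share the highest count, sorted alphabetically."""
    counts = count_kmers(sequence, k)
    if not counts:
        return []
    best = max(counts.values())
    return sorted(kmer for kmer, n in counts.items() if n == best)
```

For example, `most_frequent_kmers("ATATAGG", 3)` scans the windows ATA, TAT, ATA, TAG and AGG, and reports ATA as the only 3-mer that occurs twice.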
In this article, we are going to take a look at one of the algorithms we wrote in Part 2 of the Genome Toolkit series and attempt to optimize it.
In the previous article (Part 2 here), we wrote our first Genome Toolkit algorithm. Even though it was a very simple algorithm for finding repeating patterns (k-mers) in a DNA/genome sequence, and it seemed to work correctly, it actually had a bug. Let’s take a look at what the bug is and how we can fix it.
DNA Engine project structure and class setup.
Welcome back! Today we continue working on our DNA Toolkit project. In our last article, we created the first two functions: validate_seq and nucleotide_frequency. We
In this article, we start our work on a DNA Toolkit. We write and test our first two functions: DNA validation and nucleotide count.
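To give a feel for what these two functions might look like, here is a minimal Python sketch. The names `validate_seq` and `nucleotide_frequency` come from the series; the signatures, return values, and the `NUCLEOTIDES` constant are assumptions made for this example, not the author's actual code.

```python
from collections import Counter

# Assumed alphabet: the four DNA bases only.
NUCLEOTIDES = "ACGT"


def validate_seq(seq: str):
    """Return the upper-cased sequence if it contains only DNA bases, else None."""
    upper = seq.upper()
    if all(base in NUCLEOTIDES for base in upper):
        return upper
    return None


def nucleotide_frequency(seq: str) -> dict:
    """Count how often each of the four bases occurs in the sequence."""
    counts = Counter(seq.upper())
    # Report all four bases, including those with a count of zero.
    return {base: counts[base] for base in NUCLEOTIDES}
```

Returning `None` for an invalid sequence is one possible design; raising an exception would be an equally reasonable choice.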