Genome Toolkit Series

Genome Toolkit. Part 3: building statistical data (k-mer frequency)

December 29, 2022 | January 7, 2023 | rebelCoder

A k-mer is a contiguous sequence of k nucleotides (the building blocks of DNA) in a genome. Biologists often use k-mers to identify patterns or motifs in genomic sequences, such as repeated sequences or conserved regions. Let’s build an algorithm to do this.
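As a rough illustration of the idea (not the article's own implementation), k-mer frequencies can be gathered by sliding a window of length k across the sequence and tallying each substring. The function name `kmer_frequencies` and the sample sequence are assumptions made for this sketch.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in `sequence` (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length n has n - k + 1 windows of length k;
    # if the sequence is shorter than k, the range is empty.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Example: the 3-mer "ATG" occurs twice in this made-up sequence.
freqs = kmer_frequencies("ATGATGCA", 3)
print(freqs["ATG"])  # 2
```

A `Counter` is used here because it behaves like a dictionary of counts and offers `most_common()` for ranking the k-mers afterwards.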

Genome Toolkit. Part 2.1: Identifying, fixing, and testing a bug

December 11, 2022 | December 11, 2022 | rebelCoder

In the previous article (Part 2), we wrote our first Genome Toolkit algorithm. It was a very simple algorithm for finding repeating patterns (k-mers) in a DNA/genome sequence, and although it seemed to work correctly, it actually had a bug. Let’s take a look at what the bug is and how we can fix it.

Genome Toolkit. Part 2: in search of patterns

October 2, 2022 | December 1, 2022 | rebelCoder

First function – counting patterns in a sequence.
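A minimal sketch of what such a function could look like, assuming the goal is to count every occurrence of a pattern, including overlapping ones. The name `count_pattern` and the test sequence are hypothetical and not taken from the article.

```python
def count_pattern(sequence: str, pattern: str) -> int:
    """Count occurrences of `pattern` in `sequence`, overlaps included."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    k = len(pattern)
    # Compare the pattern against every window of length k.
    return sum(
        1
        for i in range(len(sequence) - k + 1)
        if sequence[i:i + k] == pattern
    )


print(count_pattern("ATATAT", "ATA"))  # 2 (matches at positions 0 and 2)
```

The explicit window scan is a deliberate choice in this sketch: Python's built-in `str.count` skips overlapping matches, so `"ATATAT".count("ATA")` returns 1 rather than 2.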

Genome Toolkit. Part 1: project setup

October 2, 2022 | October 15, 2022 | rebelCoder

Welcome to the new series, called “Genome Toolkit”. In this series, we will write a set of tools that will help us find and build statistical data around any DNA, RNA, or protein sequence.
