Genome Toolkit. Part 3: building statistical data (k-mer frequency)

A k-mer is a contiguous sequence of k nucleotides (the building blocks of DNA) in a genome. Biologists often use k-mers to identify patterns or motifs in genomic sequences, such as repeated sequences or conserved regions. Let’s build an algorithm to do this.
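
As a rough illustration of the idea, k-mer frequencies can be tallied by sliding a window of length k across the sequence. This is a minimal sketch, not the article's own code: the function name `kmer_frequencies` and the example sequence are assumptions made for illustration.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in `sequence` (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Slide a window of width k one position at a time.
    # If k is longer than the sequence, the range is empty and so is the result.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


if __name__ == "__main__":
    # Illustrative sequence only, not real genomic data.
    print(kmer_frequencies("ATATA", 2))
```

A `Counter` is used here because it behaves like a dictionary and can report the most frequent k-mers directly through `most_common()`.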

Tips & Tricks: A faster search for patterns (loop, list, regexp)

In this article, we take one of the algorithms we wrote in Part 2 of the Genome Toolkit series and attempt to optimize it.

Genome Toolkit. Part 2.1: Identifying, fixing, and testing a bug

In the previous article (Part 2), we wrote our first Genome Toolkit algorithm. Even though it was a very simple algorithm for finding repeating patterns (k-mers) in a DNA/genome sequence, and it seemed to work correctly, it actually had a bug. Let’s take a look at what the bug is and how we can fix it.

Book review: Bioinformatics with Python Cookbook, 3rd Edition.

Bioinformatics with Python Cookbook: Use modern Python libraries and applications to solve real-world computational biology problems, 3rd Edition.

Genome Toolkit. Part 2: in search of patterns

First function – counting patterns in a sequence.
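
One way such a function might look is sketched below. It is an assumption rather than the series' actual implementation: the name `count_pattern` is hypothetical, and the sketch counts overlapping matches, which Python's built-in `str.count` does not.

```python
def count_pattern(sequence: str, pattern: str) -> int:
    """Count occurrences of `pattern` in `sequence`, overlaps included."""
    k = len(pattern)
    if k == 0:
        return 0  # an empty pattern is treated as having no occurrences
    # Compare the pattern against every window of the same length.
    return sum(
        1
        for i in range(len(sequence) - k + 1)
        if sequence[i:i + k] == pattern
    )


if __name__ == "__main__":
    print(count_pattern("ATATA", "ATA"))  # overlapping matches are counted
    print("ATATA".count("ATA"))           # str.count skips overlapping matches
```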

Genome Toolkit. Part 1: project setup

Welcome to the new series, called “Genome Toolkit”. In this series, we will write a set of tools that will help us find and build statistical data around any DNA, RNA, and protein sequences.

Getting started in Bioinformatics: A step-by-step guide.

A guide, with advice, on getting started in Bioinformatics or transitioning into it, for people with a biology or programming background.

Tips & Tricks: Hamming Distance

Let’s look at how we can program the Hamming distance algorithm in three different ways.
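
For reference, the Hamming distance between two equal-length sequences is the number of positions at which they differ. The three variants below (an index loop, `zip` with `sum`, and a list of mismatch positions) are one possible set of approaches; which three the article itself uses is not stated here, and all function names are hypothetical.

```python
def _check_lengths(a: str, b: str) -> None:
    # Hamming distance is only defined for sequences of equal length.
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")


def hamming_loop(a: str, b: str) -> int:
    """Variant 1: explicit index loop."""
    _check_lengths(a, b)
    distance = 0
    for i in range(len(a)):
        if a[i] != b[i]:
            distance += 1
    return distance


def hamming_zip(a: str, b: str) -> int:
    """Variant 2: pair up characters with zip and sum the mismatches."""
    _check_lengths(a, b)
    return sum(x != y for x, y in zip(a, b))


def hamming_positions(a: str, b: str) -> int:
    """Variant 3: collect mismatch positions, then count them."""
    _check_lengths(a, b)
    mismatches = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(mismatches)


if __name__ == "__main__":
    print(hamming_loop("GATTACA", "GACTATA"))
```

The third variant is slower than the second but keeps the mismatch positions available, which can be useful when the locations of the differences matter.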

Bioinformatics Tools Programming in Python with Qt. Part 2.

DNA Engine project structure and class setup.

From Python to Rust: Part 3.

Python dictionary, Rust HashMap and a DNA Reverse Complement function.
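
On the Python side, a dictionary-based reverse complement might look roughly like the sketch below. The names `COMPLEMENT` and `reverse_complement` are assumptions for illustration, and the sketch handles uppercase DNA bases only.

```python
# Mapping from each DNA base to its complementary base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    # Walk the sequence backwards and swap each base for its complement.
    # A base outside A/T/G/C raises KeyError, which flags invalid input.
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```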