Bioinformatics, Programming and Open-Source Science

Genome Toolkit. Part 3: building statistical data (k-mer frequency)

December 29, 2022 (updated January 7, 2023) rebelCoder

A k-mer is a contiguous sequence of k nucleotides (the building blocks of DNA) in a genome. Biologists often use k-mers to identify patterns or motifs in genomic sequences, such as repeated sequences or conserved regions. Let’s build an algorithm to do this.
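As a rough illustration of the idea, and not the article's own code, k-mer frequencies can be built by sliding a window of width k along the sequence and tallying each substring. The function name `kmer_frequencies` and the sample sequence are assumptions made up for this sketch.

```python
from collections import Counter


def kmer_frequencies(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in `sequence` (hypothetical helper)."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    # Slide a window of width k one position at a time.
    # If the sequence is shorter than k, the range is empty and so is the result.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Made-up example sequence, chosen only to show overlapping windows.
freqs = kmer_frequencies("ATATAGAT", 2)
print(freqs.most_common())
```

A sequence of length n yields n - k + 1 windows, so the eight-letter sample above produces seven 2-mers, with "AT" the most frequent.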

Tips & Tricks

Tips & Tricks: A faster search for patterns (loop, list, regexp)

December 12, 2022 rebelCoder

In this article, we take a look at one of the algorithms we wrote in Part 2 of the Genome Toolkit series and attempt to optimize it.
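The three approaches named in the title (loop, list, regexp) might look something like the sketch below. These are illustrative versions, not the article's code, and the function names are assumptions. All three count overlapping occurrences of a pattern.

```python
import re


def count_loop(sequence: str, pattern: str) -> int:
    """Explicit loop: compare every window against the pattern."""
    k = len(pattern)
    count = 0
    for i in range(len(sequence) - k + 1):
        if sequence[i:i + k] == pattern:
            count += 1
    return count


def count_list(sequence: str, pattern: str) -> int:
    """List comprehension: collect matching start positions, then count them."""
    k = len(pattern)
    return len([i for i in range(len(sequence) - k + 1)
                if sequence[i:i + k] == pattern])


def count_regex(sequence: str, pattern: str) -> int:
    """Regular expression: a zero-width lookahead keeps overlapping matches."""
    return len(re.findall(f"(?={re.escape(pattern)})", sequence))


# Made-up example: "ATA" starts at positions 0, 2 and 4.
sample, motif = "ATATATA", "ATA"
print(count_loop(sample, motif), count_list(sample, motif), count_regex(sample, motif))
```

Note that the built-in `str.count` skips overlapping matches, which is why it is not used here. Which variant runs fastest depends on the sequence length and the Python version, so it is worth measuring with `timeit` on real data before choosing one.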

Genome Toolkit Series

Genome Toolkit. Part 2.1: Identifying, fixing, and testing a bug

December 11, 2022 rebelCoder

In the previous article (Part 2), we wrote our first Genome Toolkit algorithm. Although it was a very simple algorithm for finding repeating patterns (k-mers) in a DNA or genome sequence, and it seemed to work correctly, it actually contained a bug. Let’s take a look at what the bug is and how we can fix it.

Podcast

Book review: Bioinformatics with Python Cookbook, 3rd Edition.

December 3, 2022 (updated December 12, 2022) rebelCoder

Bioinformatics with Python Cookbook: Use modern Python libraries and applications to solve real-world computational biology problems, 3rd Edition.

Genome Toolkit Series

Genome Toolkit. Part 2: in search of patterns

October 2, 2022 (updated December 1, 2022) rebelCoder

First function – counting patterns in a sequence.

Genome Toolkit Series

Genome Toolkit. Part 1: project setup

October 2, 2022 (updated October 15, 2022) rebelCoder

Welcome to the new series, called “Genome Toolkit”. In this series, we will write a set of tools that help us find and build statistical data about any DNA, RNA, or protein sequence.