Biotite: a unifying open source computational biology framework in Python

BMC Bioinformatics. 2018 Oct 1;19(1):346. doi: 10.1186/s12859-018-2367-z.

Abstract

Background: As molecular biology generates an increasing amount of sequence and structure data, the number of software tools for analyzing this data is also growing. Most programs are made for a specific task, so the user often needs to combine multiple programs to reach a goal. This can make data processing cumbersome, inflexible and even inefficient because of the overhead of read/write operations. Therefore, it is crucial to have a comprehensive, accessible and efficient computational biology framework in a scripting language to overcome these limitations.

Results: We have developed the Python package Biotite, a general computational biology framework that represents sequence and structure data based on NumPy ndarrays. Furthermore, the package contains seamless interfaces to biological databases and external software. The source code is freely accessible at https://github.com/biotite-dev/biotite.
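
To illustrate what representing sequence data on NumPy ndarrays can look like, here is a minimal sketch in plain NumPy. It does not use Biotite's API: the alphabet, the helper names (encode, decode, reverse_complement, gc_content) and the example sequence are all assumptions made for this illustration.

```python
import numpy as np

# Hypothetical nucleotide alphabet for illustration; not taken from Biotite.
ALPHABET = "ACGT"
SYMBOL_TO_CODE = {symbol: code for code, symbol in enumerate(ALPHABET)}


def encode(sequence):
    """Store a sequence as an array of small integer symbol codes."""
    return np.array([SYMBOL_TO_CODE[s] for s in sequence], dtype=np.uint8)


def decode(codes):
    """Turn an array of symbol codes back into a string."""
    return "".join(ALPHABET[c] for c in codes)


def reverse_complement(codes):
    # With the order A, C, G, T the complement of code c is 3 - c,
    # so the whole operation is one vectorized subtraction and a reversal.
    return (3 - codes)[::-1]


def gc_content(codes):
    """Fraction of positions that are C (code 1) or G (code 2)."""
    return float(np.mean((codes == 1) | (codes == 2)))


codes = encode("ATGCGC")
rev_comp = decode(reverse_complement(codes))
gc = gc_content(codes)
```

Once the symbols are integer codes, operations such as complementing or counting reduce to vectorized array expressions instead of Python-level loops over characters.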

Conclusions: Biotite is unifying in two ways. First, it bundles popular tasks in sequence analysis and structural bioinformatics in a consistently structured package. Second, it addresses two groups of users: novice programmers get easy access to Biotite through its simplicity and comprehensive documentation, while advanced users can profit from its high performance and extensibility. Advanced users can build their algorithms on Biotite, skipping code for general functionality (such as file parsers) and focusing on what makes their software unique.

Keywords: NumPy; Open source; Python; Sequence analysis; Structural biology.

MeSH terms

  • Aluminum Silicates*
  • Computational Biology / methods*
  • Ferrous Compounds*
  • Programming Languages
  • Software

Substances

  • Aluminum Silicates
  • Ferrous Compounds
  • biotite