Biological Repository (BioR) for Genomic Annotation
Overview:
BioR is a lightweight infrastructure for distributing annotation. It is a general-purpose genomic data integration tool that supports lookups by genomic coordinate or by accession. This user guide explains how to use BioR. BioR works with catalogs, which are the data files containing the data of interest. See the Installation Guide to obtain the tool.
Catalogs:
A BioR catalog is a BED-JSON hybrid file. It is indexed with Tabix for coordinate-based searches and with BioR's own indexing system for string-matching searches.
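To make the "BED-JSON hybrid" idea concrete, here is a minimal Python sketch of reading one catalog record. It assumes a layout of three BED-style, tab-separated columns (chromosome, start, end) followed by a fourth column holding a JSON object; that layout, the names `CatalogRecord` and `parse_catalog_line`, and the sample record are illustrative assumptions, not taken from the BioR documentation.

```python
import json
from typing import Any, Dict, NamedTuple


class CatalogRecord(NamedTuple):
    """One catalog row: BED-style coordinates plus a JSON payload (assumed layout)."""
    chrom: str
    start: int
    end: int
    data: Dict[str, Any]


def parse_catalog_line(line: str) -> CatalogRecord:
    """Split a tab-delimited row into coordinates and a decoded JSON object."""
    # Split at most three times so tabs inside the JSON column stay intact.
    chrom, start, end, payload = line.rstrip("\n").split("\t", 3)
    return CatalogRecord(chrom, int(start), int(end), json.loads(payload))


# Hypothetical record, invented for illustration only.
sample = '1\t100\t250\t{"gene": "GENE_A", "accession": "ACC0001"}'
record = parse_catalog_line(sample)
print(record.chrom, record.start, record.end, record.data["gene"])
```

Under this assumed layout, the coordinate columns are what a position index such as Tabix would operate on, while a value inside the JSON column (here the made-up `accession` key) is the kind of string a text-matching index would look up.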
Toolkit:
BioR uses a pipe-and-filter architecture. Data to be annotated by BioR is streamed through a pipeline, which is a sequence of one or more pipes. Pipes is based on Flow-Based Programming by J. P. Morrison (see DataFlow-Article and Flow-Based-Programing).
Figure 1: BioRTools works by adding annotation columns to the right of the original file's columns.
BioR leverages UNIX pipes to pass data from program to program. As BioR programs work on the data, they append annotation columns on the right (the red, blue and green columns in Figure 1).
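The "annotate to the right" pattern described above can be sketched in a few lines of Python. This is not BioR code: `append_column`, `span_length`, and `constant_tag` are hypothetical names, the input rows are invented, and passing `#`-prefixed lines through unchanged is an assumption of this sketch. Each filter leaves the existing columns untouched and adds exactly one new column at the end, so filters can be chained in any number.

```python
from typing import Callable, Iterable, Iterator, List


def append_column(
    lines: Iterable[str],
    annotate: Callable[[List[str]], str],
) -> Iterator[str]:
    """Yield each tab-delimited row with one extra column appended on the right."""
    for line in lines:
        row = line.rstrip("\n")
        if row.startswith("#"):
            # Assumption of this sketch: header/comment lines pass through as-is.
            yield row
            continue
        fields = row.split("\t")
        yield "\t".join(fields + [annotate(fields)])


def span_length(fields: List[str]) -> str:
    """Hypothetical annotation: end minus start, from columns 2 and 3."""
    return str(int(fields[2]) - int(fields[1]))


def constant_tag(fields: List[str]) -> str:
    """Hypothetical second annotation, used only to show chaining."""
    return "tagged"


rows = ["#chrom\tstart\tend", "1\t100\t250", "2\t40\t90"]

# Two filters chained like two programs joined by a UNIX pipe.
pipeline = append_column(append_column(rows, span_length), constant_tag)
for out in pipeline:
    print(out)
```

Because `append_column` is a generator, rows flow through the chain one at a time rather than being loaded all at once; feeding it `sys.stdin` instead of a list would let the same function act as one stage in a shell pipeline.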