Title: Efficient pattern matching on very large strings using the Biostrings package
Author: Hervé Pagès

Biostrings is an R package that provides an efficient infrastructure for searching for patterns in strings of hundreds of millions of letters. The package implements the BString class that allows a single string to be stored in a way similar to the raw type (byte array) but with the important difference that the data are not copied on object duplication or substring extraction. The matchPattern function implements fast algorithms for matching patterns against a BString object. The BStringViews class allows compact storage of a set of views on the same BString object, typically the matches returned by the matchPattern function. In addition to the general purpose BString and BStringViews classes, the package also provides biology-oriented BString subclasses like DNAString, for storing a DNA sequence, or AAString, for storing a sequence of amino acids. We discuss the implementation of the BString class and compare R's standard pattern matching tools to those provided by Biostrings by searching for thousands of short patterns in the fly genome.
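
As a rough illustration of the workflow described above, the following minimal sketch matches one short pattern against a small DNAString and runs the same search with base R's gregexpr for comparison. It assumes the Bioconductor package Biostrings is installed. The toy sequence and the variable names are made up for the example and are not taken from the talk, and in recent Biostrings releases the views object returned by matchPattern may belong to a differently named class (for example XStringViews) than the BStringViews class named above.

```r
# Minimal sketch; assumes the Bioconductor package Biostrings is installed.
library(Biostrings)

# Toy subject sequence (illustrative only, not genome data).
subject <- DNAString("ACGTTGCAACGTACGT")
pattern <- "ACGT"

# Biostrings: the result is a set of views (start/end ranges) on `subject`.
hits <- matchPattern(pattern, subject)
hit_starts <- start(hits)
hit_ends <- end(hits)

# Base R on an ordinary character string, for comparison.
# fixed = TRUE requests a literal (non-regex) search.
base_starts <- as.integer(
  gregexpr(pattern, as.character(subject), fixed = TRUE)[[1]]
)

# Extract the first match as a subsequence of the stored string.
first_hit <- subseq(subject, start = hit_starts[1], end = hit_ends[1])
```

On this toy string both approaches report matches starting at positions 1, 9, and 13. The sketch only shows the calling pattern; it says nothing about relative speed, which depends on sequence length and the number of patterns.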