BLAS is an acronym for Basic Linear Algebra Subroutines. As the name indicates, it contains subprograms for basic operations on vectors and matrices. BLAS was designed to be used as a building block in other codes, for example LAPACK. The source code for BLAS is available through Netlib. However, many computer vendors will have a special version of BLAS tuned for maximal speed and efficiency on their computer. This is one of the main advantages of BLAS: The calling sequences are standardized so that programs that call BLAS will work on any computer that has BLAS installed. If you have a fast version of BLAS, you will also get high performance on all programs that call BLAS. Hence BLAS provides a simple and portable way to achieve high performance for calculations involving linear algebra. LAPACK is a higher-level package built on the same ideas.

Levels and naming conventions

The BLAS subroutines can be divided into three levels:

Each BLAS and LAPACK routine comes in several versions, one for each precision (data type). The first letter of the subprogram name indicates the precision used:

      S      Real single precision.
      D      Real double precision.
      C      Complex single precision.
      Z      Complex double precision.
Complex double precision is not strictly defined in Fortran 77, but most compilers will accept one of the following declarations:
      double complex list-of-variables
      complex*16     list-of-variables


Some of the BLAS 1 subprograms are: The first letter (x) can be any of the letters S,D,C,Z depending on the precision. A quick reference to BLAS 1 can be found at


Some of the BLAS 2 subprograms are: A detailed description of BLAS 2 can be found at


Some of the BLAS 3 subprograms are: The more advanced matrix operations, like solving a linear system of equations, are contained in LAPACK. A detailed description of BLAS 3 can be found at


Let us first look at a very simple BLAS routine, SSCAL. The call sequence is
      call SSCAL ( n, a, x, incx )
Here x is the vector, n is the length (number of elements in x we wish to use), and a is the scalar by which we want to multiply x. The last argument incx is the increment. Usually, incx=1 and the vector x corresponds directly to the one-dimensional Fortran array x. For incx>1 it specifies how many elements in the array we should "jump" between each element of the vector x. E.g. if incx=2 it means we should only scale every other element (note: the physical dimension of the array x should then be at least 2n-1). Consider these examples where x has been declared as real x(100).
      call SSCAL(100, a, x, 1)
      call SSCAL( 50, a, x(50), 1)
      call SSCAL( 50, a, x(2), 2)
The first line will scale all 100 elements of x by a. The next line will only scale the last 50 elements of x by a. The last line will scale all the even indices of x by a.

Observe that the array x will be overwritten by the new values. If you need to preserve a copy of the old x, you have to make a copy first e.g. by using SCOPY.

Now consider a more complicated example. Suppose you have two 2-dimensional arrays A and B, and you are asked to find the (i,j) entry of the product A*B. This is easily done by computing the inner product of row i from A and column j of B. We can use the BLAS 1 subroutine SDOT. The only difficulty is to figure out the correct indices and increments. The call sequence for SDOT is

      SDOT ( n, x, incx, y, incy )
Suppose the array declarations were
      real A(lda,lda)
      real B(ldb,ldb)
but in the program you know that the actual size of A is m*p and for B it is p*n. The i'th row of A starts at the element A(i,1). But since Fortran stores 2-dimensional arrays down columns, the next row element A(i,2) will be stored lda elements later in memory (since lda is the length of a column). Hence we set incx = lda. For the column in B there is no such problem, the elements are stored consecutively so incy = 1. The length of the inner product calculation is p. Hence the answer is
      SDOT ( p, A(i,1), lda, B(1,j), 1 )

How to get the BLAS

First of all you should check if you already have BLAS on your system. If not, you can find it on Netlib at


The BLAS routines are almost self-explanatory. Once you know which routine you need, fetch it and read the header section that explains the input and output parameters in detail. For example, here is the header of the BLAS 2 subroutine SGEMV:
     $                   BETA, Y, INCY )
*     .. Scalar Arguments ..
      REAL               ALPHA, BETA
      INTEGER            INCX, INCY, LDA, M, N
      CHARACTER*1        TRANS
*     .. Array Arguments ..
      REAL               A( LDA, * ), X( * ), Y( * )
*     ..
*  Purpose
*  =======
*  SGEMV  performs one of the matrix-vector operations
*     y := alpha*A*x + beta*y,   or   y := alpha*A'*x + beta*y,
*  where alpha and beta are scalars, x and y are vectors and A is an
*  m by n matrix.
*  Parameters
*  ==========
*           On entry, TRANS specifies the operation to be performed as
*           follows:
*              TRANS = 'N' or 'n'   y := alpha*A*x + beta*y.
*              TRANS = 'T' or 't'   y := alpha*A'*x + beta*y.
*              TRANS = 'C' or 'c'   y := alpha*A'*x + beta*y.
*           Unchanged on exit.
*  M      - INTEGER.
*           On entry, M specifies the number of rows of the matrix A.
*           M must be at least zero.
*           Unchanged on exit.
*  N      - INTEGER.
*           On entry, N specifies the number of columns of the matrix A.
*           N must be at least zero.
*           Unchanged on exit.
*  ALPHA  - REAL            .
*           On entry, ALPHA specifies the scalar alpha.
*           Unchanged on exit.
*  A      - REAL             array of DIMENSION ( LDA, n ).
*           Before entry, the leading m by n part of the array A must
*           contain the matrix of coefficients.
*           Unchanged on exit.
*  LDA    - INTEGER.
*           On entry, LDA specifies the first dimension of A as declared
*           in the calling (sub) program. LDA must be at least
*           max( 1, m ).
*           Unchanged on exit.
*  X      - REAL             array of DIMENSION at least
*           ( 1 + ( n - 1 )*abs( INCX ) ) when TRANS = 'N' or 'n'
*           and at least
*           ( 1 + ( m - 1 )*abs( INCX ) ) otherwise.
*           Before entry, the incremented array X must contain the
*           vector x.
*           Unchanged on exit.
*           On entry, INCX specifies the increment for the elements of
*           X. INCX must not be zero.
*           Unchanged on exit.
*  BETA   - REAL            .
*           On entry, BETA specifies the scalar beta. When BETA is
*           supplied as zero then Y need not be set on input.
*           Unchanged on exit.
*  Y      - REAL             array of DIMENSION at least
*           ( 1 + ( m - 1 )*abs( INCY ) ) when TRANS = 'N' or 'n'
*           and at least
*           ( 1 + ( n - 1 )*abs( INCY ) ) otherwise.
*           Before entry with BETA non-zero, the incremented array Y
*           must contain the vector y. On exit, Y is overwritten by the
*           updated vector y.
*           On entry, INCY specifies the increment for the elements of
*           Y. INCY must not be zero.
*           Unchanged on exit.
*  Level 2 Blas routine.
*  -- Written on 22-October-1986.
*     Jack Dongarra, Argonne National Lab.
*     Jeremy Du Croz, Nag Central Office.
*     Sven Hammarling, Nag Central Office.
*     Richard Hanson, Sandia National Labs.

[Fortran Tutorial Home]