ACCELERATE

NAME
Accelerate, vecLib, vImage, AltiVec, vMathLib, BLAS, LAPACK, vDSP, vBigNum, vBasicOps, Vector Computation, Velocity Engine, Extended Math Library - This man page introduces the vector instruction set extension to the PowerPC architecture known as Velocity Engine (or AltiVec), the Accelerate umbrella framework, its constituent libraries, and programming support in Mac OS X.

DESCRIPTION
The PowerPC vector instruction set architecture is based on a separate SIMD-style execution unit with inherently high data parallelism. This parallelism is further enhanced through superscalar dispatch to multiple execution units and execution unit pipelines. All vector instructions are designed to be easily pipelined, with pipeline latencies no greater than those of the scalar double-precision floating-point fused multiply-add class of instructions. There are no operating mode switches that would preclude fine-grain interleaving of vector instructions with the existing floating-point and integer instructions. Parallelism with the integer and floating-point instructions is simplified because the vector unit never generates an exception and has few shared resources or communication paths that require it to be tightly synchronized with the other units.

Highlights

    Fixed vector length of 128 bits (16 8-bit elements, 8 16-bit elements, or 4 32-bit elements).
    Signed and unsigned 8-, 16-, and 32-bit integers, and IEEE single-precision floats.
    Saturation arithmetic.
    32-register namespace.
    Vector register file architecturally separate from floating-point and integer registers.
    No mode switching that would increase the overhead of using the instructions.
    4-operand, non-destructive instructions (3 source, 1 result).
    Operations selected based on utility to digital signal processing algorithms (including 2D and 3D image processing).

Who benefits?

Many of the services provided by Mac OS X (e.g., Quartz, QuickTime, OpenGL, CoreAudio) already exploit the vector acceleration available on Macintosh G4 and G5 computers. All Mac OS X users enjoy these benefits.

Many applications that run on Mac OS X (e.g., iTunes, iMovie) have already been coded to use the vector libraries and vector instruction set. Users of these applications enjoy the benefits of vector acceleration.

Software developers who would like their code to use the vector facility on Macintosh G4 and G5 computers may choose to:

    (1) Make explicit calls to entry points in the Accelerate framework. Apple has optimized many of these routines for the vector engine (see the framework discussion that follows).

and/or

    (2) Program directly to the vector unit using the "Programming Interface Model."

Note that a programmer must take one of these explicit actions to engage the vector engine; otherwise it remains idle.

Where to go from here:

Browse a comprehensive introduction to vector programming:
    http://developer.apple.com/hardware/ve

Examine the prototypes for functions you can invoke:
    /System/Library/Frameworks/vecLib.framework/Headers/*.h
    /System/Library/Frameworks/Accelerate.framework/Frameworks/vImage.framework/Headers/*.h

Include the interfaces in the code you write:
    #include <Accelerate/Accelerate.h>

Compile and link your code:
    cc -faltivec -framework Accelerate file.c

Accelerate Umbrella Framework

The Accelerate umbrella framework encompasses all the libraries provided with Mac OS X that Apple has optimized for high-performance vector and numerical computing. Subsequent sections describe the sub-frameworks that make up the Accelerate framework.

vImage Framework

A collection of basic image processing filters such as convolution, morphological, and geometric transforms. Alpha compositing and histogram operations are also supported.

vecLib Framework

The vecLib framework is a collection of facilities covering digital signal processing (vDSP), matrix computations (BLAS), numerical linear algebra (LAPACK), mathematical routines (vMathLib), basic operations (vBasicOps), and large number calculations (vBigNum).

The vDSP, BLAS, and LAPACK components of vecLib run in both the scalar and vector domains. vecLib automatically detects the presence of the vector engine and uses it.
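
As a rough sketch of option (1) above, an explicit call to a framework entry point, the following wraps the standard C BLAS routine cblas_sdot that vecLib exports. The helper name dot is hypothetical. The #else branch is not part of Accelerate: it is a plain scalar stand-in, added here only as an assumption so the snippet also builds where the framework is unavailable.

```c
#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>   /* umbrella header named in this page */
#else
/* Stand-in for systems without the framework (illustration only).
   Same prototype as the standard C BLAS routine; handles positive
   strides only. */
static float cblas_sdot(const int N, const float *X, const int incX,
                        const float *Y, const int incY)
{
    float sum = 0.0f;
    for (int i = 0; i < N; ++i)
        sum += X[i * incX] * Y[i * incY];
    return sum;
}
#endif

/* Hypothetical helper: dot product of two contiguous float arrays,
   computed through the C BLAS entry point with unit strides. */
static float dot(const float *x, const float *y, int n)
{
    return cblas_sdot(n, x, 1, y, 1);
}
```

On Mac OS X a file containing this helper would be built with the command shown earlier, cc -faltivec -framework Accelerate file.c. The caller needs no vector-specific code, since vecLib chooses the vector or scalar path itself.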

vMathLib mirrors the existing scalar libm on the vector engine, and vBasicOps is meant to complement the processor by providing more functionality, such as a 32x32 vector integer multiply. vBigNum, vBasicOps, and vMathLib run only on the vector engine.

There is also another matrix computation package in vecLib called vectorOps. It works in somewhat the same spirit as the BLAS. It is best suited for small problems when availability of source is preferred. It can also be used as an educational tool to gain insight into the workings of the PowerPC vector unit. In most cases, the use of BLAS instead of vectorOps is recommended.

vDSP

The vDSP Library provides mathematical functions for applications such as speech, sound, audio, and video processing, diagnostic medical imaging, radar signal processing, seismic analysis, and scientific data processing.

The vDSP functions operate on real and complex data types. The functions include data type conversions, fast Fourier transforms (FFTs), and vector-to-vector and vector-to-scalar operations.

The vDSP functions have been implemented in two ways: as vectorized code (for single precision only), which uses the vector unit on the PowerPC G4 and G5 microprocessors, and as scalar code, which runs on Macintosh models that have a G3 microprocessor.

It is noteworthy that vDSP's FFTs are among the fastest implementations of the Discrete Fourier Transform available anywhere.

The vDSP Library itself is included as part of vecLib in Mac OS X. The header file, vDSP.h, defines data types used by the vDSP functions and symbols accepted as flag arguments to vDSP functions.

vDSP functions are available in single and double precision. Note that only the single-precision functions are vectorized, due to the underlying instruction set architecture of the vector engine on board G4 and G5 processors.

For more information about vDSP, download the manual at <http://developer.apple.com/hardware/ve/downloads/vDSP.sit.hqx>.

BLAS

The Basic Linear Algebra Subroutines (BLAS) are high quality routines for performing basic vector and matrix operations. Level 1 BLAS consists of vector-vector operations, Level 2 BLAS consists of matrix-vector operations, and Level 3 BLAS consists of matrix-matrix operations. The efficiency, portability, and wide adoption of the BLAS have made them commonplace in the development of high quality linear algebra software such as LAPACK, and in other technologies requiring fast vector and matrix calculations.

All the industry standard FORTRAN BLAS entry points and the standard C BLAS entry points are exported from the vecLib framework (the latter are commonly denoted the legacy C BLAS).

For more information refer to <http://www.netlib.org/blas/faq.html>.

LAPACK

LAPACK provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems. The associated matrix factorizations (LU, Cholesky, QR, SVD, Schur, generalized Schur) are also provided, as are related computations such as reordering of the Schur factorizations and estimating condition numbers. Dense and banded matrices are handled, but not general sparse matrices. In all areas, similar functionality is provided for real and complex matrices, in both single and double precision. LAPACK in vecLib makes full use of the optimized BLAS and fully benefits from their performance.

All the industry standard FORTRAN LAPACK entry points are exported from the vecLib framework. C programs may make calls to the FORTRAN entry points using the prototypes set out in "/System/Library/Frameworks/vecLib.framework/Headers/clapack.h".

For more information refer to <http://www.netlib.org/lapack/index.html>.

Note that vecLib's LAPACK was built using the FORTRAN to C converter called f2c. Users must be aware that ALL arguments must be passed by reference. This includes all scalar arguments, such as the matrix dimensions M and N. Note also that the memory arrangement of a two-dimensional array differs between Fortran and C.

For more information refer to <http://www.netlib.org/clapack/readme>.

vBasicOps

A collection of basic operations such as add, subtract, multiply, and divide that complement the vector processor's basic operations up to 128 bits. Consult "/System/Library/Frameworks/vecLib.framework/Headers/vBasicOps.h" for further information.

vBigNum

Routines for large number calculations starting at 128 bits. Consult "/System/Library/Frameworks/vecLib.framework/Headers/vBigNum.h" for further information.

Darwin                          June 6, 2002                          Darwin
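
The two LAPACK calling conventions noted above (every argument passed by reference, and Fortran's array layout) can be illustrated with a small, self-contained sketch. Fortran stores two-dimensional arrays column by column, whereas C stores them row by row. Both function names below are hypothetical; column_sum_ is not a LAPACK entry point and only imitates the f2c calling style.

```c
#include <stddef.h>

/* Copy an m-by-n matrix from row-major (C) order into column-major
   (Fortran) order with leading dimension m. */
static void row_to_col_major(int m, int n, const double *row_major,
                             double *col_major)
{
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            col_major[(size_t)j * m + i] = row_major[(size_t)i * n + j];
}

/* Hypothetical routine in the f2c calling style (NOT a LAPACK entry
   point): every argument, including the scalar dimensions, is passed
   by pointer. Returns the sum of column *j (1-based, as in Fortran)
   of the column-major matrix a, which has *m rows and leading
   dimension *lda. */
static double column_sum_(const int *m, const int *j, const double *a,
                          const int *lda)
{
    double sum = 0.0;
    for (int i = 0; i < *m; ++i)
        sum += a[(size_t)(*j - 1) * (size_t)*lda + i];
    return sum;
}
```

A caller keeps the dimensions in variables and passes their addresses, for example column_sum_(&m, &j, a, &lda), rather than passing literal values.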
