AltiVec and Open Source


Subject: AltiVec and Open Source
From: Jason Titus (jason@iatlas.com)
Date: Thu Sep 16 1999 - 15:46:41 MDT


Hi... I've been spending a fair amount of time lately looking at AltiVec and
trying to figure out ways to maximize its usefulness in Linux and databases
(MySQL in particular). Seems like there are a number of ways of using it to
speed up memory copies, character comparisons, string length calculations,
etc. I haven't found much existing work, and so started looking to see what
the x86 folks have been up to w/ MMX, 3DNow! and SSE. Unfortunately, the
answer seems to be - not much. But I have found a couple of good things I
thought might be of interest to Linux PPC folks -

-----

A C-like language for dealing with SIMD instructions across multiple
implementations (MMX, 3DNow!, VSI (?), etc.)

http://shay.ecn.purdue.edu/~swar/Swarc/Index.html
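
The "swar" in that URL presumably stands for SIMD within a register: packing
several small values into one ordinary machine word and operating on all of
them at once. As a minimal sketch of that idea in plain C (not Swarc syntax,
and not AltiVec), here is a string-length routine that tests four bytes per
step. The names swar_has_zero_byte and swar_strlen are made up for
illustration. The routine assumes it is safe to read up to the end of the
aligned 4-byte word that holds the terminator, which strict ISO C does not
guarantee, so treat it as a sketch rather than a drop-in strlen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any of the four bytes in w is zero (classic SWAR test). */
static uint32_t swar_has_zero_byte(uint32_t w)
{
    return (w - 0x01010101u) & ~w & 0x80808080u;
}

/* Hypothetical helper: strlen that checks four bytes per iteration. */
static size_t swar_strlen(const char *s)
{
    const char *p = s;
    uint32_t w;

    assert(s != NULL);

    /* Go byte by byte until p is 4-byte aligned, so that the word reads
       below never straddle a page boundary. */
    while ((uintptr_t) p & 3u) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Word at a time: stop at the first word that contains a zero byte. */
    for (;;) {
        memcpy(&w, p, sizeof w);    /* aligned 4-byte load */
        if (swar_has_zero_byte(w))
            break;
        p += 4;
    }

    /* Locate the exact byte within that word. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

An AltiVec version follows the same three phases (align, scan in wide
chunks, pin down the byte), just with 16-byte vectors instead of 4-byte
words.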

------

An AltiVec version of strlen -

Subject: Re: Multiprocessing capabilites of upcoming PPC processors
Date: 1998/06/05
Author: Alex Rosenberg <alexr@I.HATE.SPAM>
In article <6l7h8t$pvm$1@andros.cygnus.com>,
billm@cygnus.com (Bill Moyer) wrote:
 
>  AltiVec's 128-bit byte compare operation is considerably more powerful for
>string manipulations than Alpha's 64-bit byte compare operation, even though
>the AV software has to do a binary-search on the last word of the operation
>in order to find the index to the byte that triggers the end of the loop,
>while the Alpha generates a bitmask which can be used to jump into a table
>which loads the right value into a register without all the extra branching. 
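
The two end-of-loop strategies contrasted in the quoted paragraph can be
sketched in portable C. This is only a scalar model, assuming a 32-bit word
for the binary search and an 8-byte word for the bitmask; it uses no AltiVec
or Alpha instructions, and the names first_zero_byte_bsearch,
zero_byte_mask8, first_set_table and init_first_set_table are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Binary search: index (0 = most significant byte) of the first zero byte
   in a 32-bit word that is known to contain one. */
static unsigned first_zero_byte_bsearch(uint32_t w)
{
    unsigned idx = 0;
    uint32_t half = w >> 16;            /* start with bytes 0 and 1 */

    assert(((w - 0x01010101u) & ~w & 0x80808080u) != 0);

    /* No zero byte in the upper half: continue in the lower half. */
    if ((half & 0xFF00u) != 0 && (half & 0x00FFu) != 0) {
        idx = 2;
        half = w & 0xFFFFu;
    }
    /* Within the chosen half, pick the first or the second byte. */
    if ((half & 0xFF00u) != 0)
        idx += 1;
    return idx;
}

/* Bitmask: bit i is set when byte i of w (byte 0 = least significant)
   is zero. */
static unsigned zero_byte_mask8(uint64_t w)
{
    unsigned mask = 0, i;

    for (i = 0; i < 8; i++)
        if (((w >> (8 * i)) & 0xFFu) == 0)
            mask |= 1u << i;
    return mask;
}

/* Table mapping each nonzero 8-bit mask to the index of its lowest set
   bit, so the byte position comes from one lookup with no branching. */
static unsigned char first_set_table[256];

static void init_first_set_table(void)
{
    unsigned m, i;

    for (m = 1; m < 256; m++) {
        for (i = 0; i < 8; i++) {
            if (m & (1u << i)) {
                first_set_table[m] = (unsigned char) i;
                break;
            }
        }
    }
}
```

After init_first_set_table() has run, first_set_table[zero_byte_mask8(w)]
gives the position of the first zero byte directly, whereas the binary
search needs a branch at each halving step.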
 
What follows is a version of strlen written to use AltiVec. It doesn't
perform a binary search to find the zero element, but rather produces a
bitmask indicating which elements are zero. This bitmask is then used with
the cntlzw instruction to find the first set bit from the left. It's several
cycles shorter than the equivalent binary search tricks for AltiVec. This
trick was suggested by Keith Diefendorff.
 
size_t vec_strlen(const char *s)
{
  int count = 0;
  const unsigned char *t = (const unsigned char *) s;
 
  while ((unsigned long) t & (vec_step(vector unsigned char)-1))   {
      if (*t++ != 0)
        count++;
      else
        return count;
  }
 
  {
      vector unsigned char *v = (vector unsigned char *) t;
      vector unsigned char buf = *v++;
      vector unsigned char zeros = (vector unsigned char)(0);
 
      while (vec_all_ne(buf, zeros))
      {
        buf = *v++;
        count += vec_step(vector unsigned char);
      }
 
      {
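        /* One bit per byte lane; the pattern repeats for each 8-byte half. */
        vector unsigned char bit_encode = (vector unsigned char)
          (0x80,0x40,0x20,0x10,0x08,0x04,0x02,0x01,
           0x80,0x40,0x20,0x10,0x08,0x04,0x02,0x01);
        vector unsigned long shift_constant =
          (vector unsigned long)(8, 8, 0, 0);
        vector unsigned char work;

        /* Keep a lane's bit only where the corresponding byte is zero. */
        work = vec_and(vec_cmpeq(buf, zeros), bit_encode);
        /* Add up each group of four bytes into one 32-bit word. */
        work = (vector unsigned char)
          vec_sum4s(work, (vector unsigned long) zeros);
        /* Move the first two words up so all four bit groups are disjoint. */
        work = (vector unsigned char)
          vec_sl((vector unsigned long) work, shift_constant);
        /* Sum the four words; the 16-bit mask lands in the last word. */
        work = (vector unsigned char)
          vec_sums((vector signed long) work, (vector signed long) zeros);
        /* Leading zeros of the mask = index of the first zero byte. */
        count += __cntlzw(((unsigned long *) &work)[3] << 16);
      }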
  }
 
  return count;
}
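
For anyone without a G4 to try this on, the tail of the routine above can be
modeled in portable C. The sketch below is a scalar stand-in, not AltiVec
code: it assumes the vector operations behave as the listing implies (per-byte
compare, sum of each group of four bytes, per-word shift, sum across words),
and the names count_leading_zeros32 and first_zero_in_block are invented for
illustration. It shows how the sixteen compare results collapse into a 16-bit
mask whose leading-zero count is the index of the first zero byte.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for the PowerPC cntlzw instruction: number of leading
   zero bits in a 32-bit word (32 when w == 0). */
static unsigned count_leading_zeros32(uint32_t w)
{
    unsigned n = 0;

    if (w == 0)
        return 32;
    while ((w & 0x80000000u) == 0) {
        w <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical scalar model of the vector tail: collapse a 16-byte block
   into a 16-bit mask (byte 0 -> bit 15), then count leading zeros.
   Returns 32 if the block contains no zero byte. */
static unsigned first_zero_in_block(const unsigned char block[16])
{
    static const unsigned char bit_encode[16] = {
        0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01,
        0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01
    };
    static const unsigned shift_constant[4] = { 8, 8, 0, 0 };
    uint32_t word[4] = { 0, 0, 0, 0 };
    uint32_t mask = 0;
    unsigned i;

    /* Models vec_cmpeq + vec_and + vec_sum4s: each zero byte contributes
       its lane bit to the 32-bit word covering its group of four bytes. */
    for (i = 0; i < 16; i++)
        if (block[i] == 0)
            word[i / 4] += bit_encode[i];

    /* Models vec_sl followed by vec_sums: shift the first two words up by
       8 bits, then add all four words into a single 16-bit mask. */
    for (i = 0; i < 4; i++)
        mask += word[i] << shift_constant[i];

    /* Put the mask in the top half so byte 0 corresponds to bit 31. */
    return count_leading_zeros32(mask << 16);
}
```

For a block holding "abcde" followed by zero bytes, the mask has its highest
set bit at position 10, and after the shift by 16 the leading-zero count is 5,
the length of the string within that block.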

-----

I would love to hear any news on what people are planning for Open Source
AltiVec support - I am planning on getting a G4 ASAP to start doing some
work on these ideas. I am more of a scripter and debugger than a coder, but
let me know if I can be of any help!

Jason Titus
jason@iatlas.com



This archive was generated by hypermail 2a24 : Fri Oct 01 1999 - 16:13:44 MDT