***************************************************************
* This is the main program to perform an algorithm for the    *
* subset selection procedure described in the paper:          *
*                                                             *
*     McCabe, George P. (1984), Principal Variables,          *
*     Technometrics, 26, 137-144                              *
*                                                             *
*     Department of Statistics                                *
*     Purdue University                                       *
*     West Lafayette, IN 47907                                *
*                                                             *
* SAS programmer---Zusheng Jin                                *
*                  Department of Statistics                   *
*                  Purdue University                          *
*                  May 1994                                   *
*                                                             *
* Translated from the FORTRAN program by                      *
* Programmer----Regina Becker                                 *
*               Department of Statistics                      *
*               Purdue University                             *
*               June 1984                                     *
***************************************************************
***************************************************************
* Input information:                                          *
*                                                             *
* Variables:                                                  *
*   xmsum :the input correlation or covariance matrix         *
*          to be analyzed.                                    *
*                                                             *
*   ip    :information input code                             *
*          1 if correlation matrix is input                   *
*          2 if covariance matrix is input                    *
*                                                             *
*   ix    :type of analysis requested                         *
*          0 if correlation analysis is requested             *
*          1 if covariance analysis is requested              *
*                                                             *
*   nbest :the number of best subsets requested               *
***************************************************************
***************************************************************
* Output information:                                         *
*   All the relevant results are in the listing file.         *
***************************************************************
***************************************************************
* The following subroutines are called in the program:        *
*   1. disc----from the file "disc.sas". This subroutine      *
*              determines best subset and percent variance    *
*              explained.                                     *
*                                                             *
*   2. eigen---from SAS built-in function. This subroutine    *
*              calculates eigenvalues and eigenvectors.       *
***************************************************************;

start pvar(xmsum, ip, ix, nbest);
  file print;

  * Offsets into the packed lower-triangular work vector w.
    Element (i,j), j <= i, of the np-variable matrix is stored
    at w[indblk[np] + indrow[i] + j];
  indblk = {0, 1, 4, 10, 20, 35, 56, 84, 120, 165, 220, 286, 364, 455,
            560, 680, 816, 969, 1140, 1330};
  indrow = {0, 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105,
            120, 136, 153, 171, 190};

  * Work arrays built from a 1 x 20 zero vector (w is 1 x 1600);
  nm = {0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0};
  xsum = nm;
  w = nm||nm||nm||nm||nm||nm||nm||nm||nm||nm;
  w = w||w||w||w||w||w||w||w;
  x = nm;
  a = nm||nm||nm||nm;

  * Validate the arguments. The test below accepts np = 2 to 19;
  np = ncol(xmsum);
  if np <= 1 | np >= 20 then do;
    put "Error, number of variables must be between 2 and 19";
    goto link80;
  end;
  if nbest < 1 | nbest > 20 then nbest = 10;
  if ip < 0 | ip > 2 then do;
    put "Error on parameter card --data input specification";
    goto link80;
  end;
  if ix ^= 0 & ix ^= 1 then do;
    put "Error on parameter card ---analysis specification";
    goto link80;
  end;

  * Echo the settings to the listing file;
  put "Number of variables =" np;
  put "Number of best subset=" nbest;
  if ip = 0 then put "IP=0: data input";
  if ip = 1 then put "IP=1: input is correlation matrix";
  if ip = 2 then put "IP=2: input is covariance matrix";
  if ix = 0 then put "Correlation matrix analysis requested";
  if ix = 1 then put "Covariance matrix analysis requested";

  ia = indblk[np];
  if ip = 2 & ix = 1 then goto link342;
  if ip = 2 & ix = 0 then goto link322;
  if ip = 1 then goto link342;

  * Scale the lower triangle of xmsum to correlations;
link322:
  do i = 2 to np;
    do j = 1 to i-1;
      xmsum[i,j] = xmsum[i,j] / sqrt(xmsum[i,i] * xmsum[j,j]);
    end;
  end;
  do i = 1 to np;
    xmsum[i,i] = 1.0;
  end;

  * Copy the lower triangle of xmsum into the packed vector w;
link342:
  do i = 1 to np;
    do j = 1 to i;
      ibr = ia + indrow[i];
      w[ibr+j] = xmsum[i,j];
    end;
  end;

  put "The correlation matrix w =";
  do i = 1 to np;
    ibr = ia + indrow[i];
    do j = 1 to i;
      put (w[ibr+j]) ' ' @;
    end;
    put /;
  end;

  * Set xmsum to symmetric storage mode;
  do ii = 1 to np;
    do jj = ii+1 to np;
      xmsum[ii,jj] = xmsum[jj,ii];
    end;
  end;

  * Calculate eigenvalues and eigenvectors.
    Currently the eigenvectors are not used;
  call eigen(d, z, xmsum);
  * d returns eigenvalues, largest to smallest;
  * z eigenvectors corresponding to d;

  * Calculate PCT variance explained by first k principal
    components;
  total = sum(d);
  pcsum = 0.0;
  do i = 1 to np;
    pcsum = pcsum + d[i];
    pcvar = 100 * pcsum / total;
    put "Pct var. explained by first" i " principal components =" pcvar;
  end;
  put " ";
  put " ";
  put " ";
  put " ";

  call disc(w, np, indblk, indrow, nbest);

link80:
finish pvar;
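
/* ------------------------------------------------------------------
   Usage sketch (not part of the original program). It assumes that
   pvar and the disc module from "disc.sas" have both been defined in
   the current PROC IML session. The 3 x 3 correlation matrix below
   is made-up illustrative data, and the name corr is hypothetical.

     corr = {1.0 0.5 0.3,
             0.5 1.0 0.4,
             0.3 0.4 1.0};
     run pvar(corr, 1, 0, 5);

   Here ip = 1 declares the input to be a correlation matrix, ix = 0
   requests a correlation analysis, and nbest = 5 asks for the five
   best subsets. Only the lower triangle of the input is read. Since
   pvar assigns to xmsum, it is probably safest to pass a copy if the
   original matrix is needed afterwards.
   ------------------------------------------------------------------ */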