Installation hints:
The whole SAS package needs more than 500MB of disk space. The part used in STAT514 takes about 300MB. To keep your SAS to "only" 300MB:
1) Ignore the CDs titled "Online Doc" and "Client-Side Components".
2) Say "No" when asked if you want to install help in "Simple HTML".
3) Choose Custom Installation and only choose the following components: Base SAS, Core of the SAS system, SAS/GRAPH, SAS/QC, and SAS/STAT.
Create or Open a SAS file:
After SAS is activated, you will see several windows. One is the
Editor, in which you can create and modify SAS programs. In
today's lab, you will use sample SAS programs only. A sample
SAS file is given below. You can copy and
paste it from this webpage into
the SAS Editor window. To open a saved SAS file instead,
either double-click on the .sas file, or highlight the Editor
window first and then click on File menu->Open->...

    data Class;
        input Name $ Height Weight Age @@;
        datalines;
    Alfred 69.0 112.5 14 Alice 56.5 84.0 13 Barbara 65.3 98.0 13
    Carol 62.8 102.5 14 Henry 63.5 102.5 14 James 57.3 83.0 12
    Jane 59.8 84.5 12 Janet 62.5 112.5 15 Jeffrey 62.5 84.0 13
    John 59.0 99.5 12 Joyce 51.3 50.5 11 Judy 64.3 90.0 14
    Louise 56.3 77.0 12 Mary 66.5 112.0 15 Philip 72.0 150.0 16
    Robert 64.8 128.0 12 Ronald 67.0 133.0 15 Thomas 57.5 85.0 11
    William 66.5 112.0 15
    ;

    symbol1 v=dot c=blue height=1.5pct;

    proc reg data=Class;
        model Weight = Height;
    run;
        plot Weight*Height/cframe=ligr;
    run;
    quit;

Run a SAS file:
SAS Output:
The results appear in several other windows. The Log window is
a step-by-step account of what SAS did with your program. SAS reports
errors in your program here. Special graphics (plots) appear in a
separate Graph window with one graph per page. Use the Page Up
and Page Down keys to view the graphs one by one. The Output
window contains the text output (the analytical results) from your
program.
If you make changes to your SAS program and re-submit it, the new results do not replace the old ones; they are appended to them, which can make the new results hard to find. A simple solution is to clear the windows before you submit the modified file. In the Log window, right-click to bring up the contextual menu, then choose Edit->Clear All. For the Output and Graph windows, the most effective way is to go to the Results window (the left-most window), highlight the top-level results folder, then click on the X button or choose Edit->Clear All.
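The Log and Output windows can also be cleared from the program itself. The following is a minimal sketch, assuming the interactive windowing version of SAS described here; the dm statement sends commands to the named windows, and placing it at the top of a program is a suggestion, not part of the sample files.

```sas
/* Clear the Log and Output windows before the rest of the program runs. */
dm 'log; clear; output; clear;';
```

This does not remove old graphs, so the Graph window still has to be cleared by hand as described above.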
Save/Print SAS Results:
You can highlight a window and choose File
menu->Save/Print to save or print its contents. SAS tends
to generate many pages of output, so it is better to move the
Output contents into a word processor like Microsoft Word. To
save the Output window as an .rtf file, highlight the Output
window and select File menu->Save as, then choose RTF Files as
the save-as type.
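Text output can also be routed to an RTF file directly from the program with the Output Delivery System (ODS). This is only a sketch: the file path is a made-up example, and it assumes that your SAS release supports the ods rtf destination and that the Class data set from the sample program has already been created.

```sas
/* Hypothetical path; change it to a folder that exists on your machine. */
ods rtf file='c:\saswork\results.rtf';

proc print data=Class;
run;

/* Closing the destination finishes writing the file. */
ods rtf close;
```

Everything printed between the two ods statements ends up in the RTF file, which can then be opened in Word.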
The graphics can also be cut and pasted into Word documents. Highlight the Graph window and go to the graphic of interest, then click the Edit Graph button in the tool bar (or go to Tools menu->Graphics Editor). Once in the graphics editor, you can add to or edit the graphic. To copy the graphic to Word, select Edit->Select->All and then Copy. You can also export the graphic as an image (.bmp, .gif, .jpeg, or .ps) and import it into Word. In this case, you cannot edit it once it is in Word.

    options ls=75 ps=60 nocenter;
    goptions colors=(none) device=win target=winprtm rotate=landscape ftext=swiss
             hsize=8.0in vsize=6.0in htext=1.5 htitle=1.5 hpos=60 vpos=60
             horigin=0.5in vorigin=0.5in;

    data one;
        infile 'c:\saswork\data\tensile.dat';
        input percent strength time;

    title1 'example';
    proc print data=one;
    run;

Note very carefully that every SAS statement ends with a semicolon. The indentation and blank lines just make the program easier to read. run tells SAS to execute the commands that precede it. Note also that names in SAS should be no more than 8 characters long, should contain only letters and numbers, and should begin with a letter. These restrictions appear to be relaxed in more recent versions of SAS, but I will still follow this rule.
options ls=75 ps=60 restricts the output to 75 columns and 60 lines per page. The nocenter option tells SAS not to center the output. goptions specifies various options for the graphics. These settings should create graphics that fit nicely in Word. The colors=(none) option tells SAS to use black and white only.
title1 prints a title on each page of your output to help you identify it later. You should always do this. You can print more than one title line by adding title2, title3, and so on. The title text must be enclosed in single quotes. The last title will be used on all subsequent graphs. To turn the last title off, you need the statement goptions reset=title.
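A short illustration of multi-line titles follows. The title texts are invented for this example; the data set one is the one created in the program above.

```sas
title1 'Tensile strength data';      /* first title line  */
title2 'Listing of the raw data';    /* second title line */
proc print data=one;
run;

title;   /* a title statement with no text cancels all title lines */
```

The bare title statement at the end is another way to clear titles so they do not carry over to later output.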
data one: SAS programs usually consist of data steps and procedures. A data statement names a data set. The lines following a data statement create the data set. This program has one data statement, which creates a SAS data set called one containing three variables.
infile and input: here we read data from a file. The infile statement tells SAS which file to read and where the file is located. Be sure to put a single quote on either end of the file's name. The input statement describes the data. We name the three variables percent, strength, and time. In this example, tensile.dat is an existing data file, and SAS uses infile to read it into the SAS system. If you need to enter a new data set in SAS, use the datalines statement as demonstrated in the previous example.
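When the data are typed into the program rather than read from a file, the same three variables can be entered with datalines. This is a minimal sketch: the data set name two and the numbers are invented for illustration and are not the contents of tensile.dat.

```sas
/* Made-up values, one observation per line: percent strength time */
data two;
    input percent strength time;
    datalines;
15  7 1
20 12 2
25 14 3
;
run;

proc print data=two;
run;
```

Because each observation sits on its own line here, the @@ used in the first sample program is not needed.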
proc: proc is the abbreviation of procedure. SAS consists of many procedures that provide a variety of functionality for data management, analysis and visualization. The proc used in the above program is print, which prints the imported or created data to the Output window so that you can verify that the data are correct. The general format of a procedure command is

    proc procname options;
        statement / statement options;
        statement / statement options;
        .
        .

Now, the second block of tensile.sas is given as follows:

    symbol1 v=circle i=none;
    title1 'Plot of Strength vs Percent Blend';
    proc gplot data=one;
        plot strength*percent/frame;
    run;

proc gplot makes a scatter plot. Note that the y (vertical) variable is given first. The symbol1 statement specifies the symbol to be used in the plot. The frame option puts a box around the plot.

    proc boxplot;
        plot strength*percent/boxstyle=skeletal pctldef=4;
    run;

proc boxplot creates boxplots of the data. Note that the y (vertical) variable is given first. The skeletal option means that the whiskers of each box extend to the minimum and maximum values. The pctldef option specifies which definition SAS uses to compute percentiles.

    proc glm;
        class percent;
        model strength=percent;
        output out=oneres p=pred r=res;
    run;

proc glm and proc mixed are two linear model procedures you will need for many of your homeworks. Please consult the SAS help for more details. We will discuss the procs/outputs further in class later on. The model statement has the form

    model dependent-variable = independent-variables;

The equal sign can be interpreted as "is explained by". The output statement enables you to save results for further analysis. Here it creates a new data set named oneres, which contains all the original data plus additional variables. The new variables are the predicted (p=pred) and residual (r=res) values.
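To check what the output statement produced, the new data set can be listed. A small sketch, using only the variable names already defined in the program above:

```sas
/* List the original variables together with the new pred and res columns. */
proc print data=oneres;
    var percent strength pred res;
run;
```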

    proc sort;
        by pred;

    symbol1 v=circle i=sm50;
    title1 'Residual Plot';
    proc gplot;
        plot res*pred/frame;
    run;

proc sort sorts the data according to one or more specified variables. In this case, the data are sorted from smallest to largest according to the predicted values from the linear model. The plot statement generates a residual plot.
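Neither proc sort nor proc gplot above names a data set, so each uses the most recently created one (here oneres, from the output statement in proc glm). A sketch of the same steps with the data set named explicitly, which may be less error-prone in longer programs:

```sas
/* Same steps as above, with the input data set named explicitly. */
proc sort data=oneres;
    by pred;
run;

symbol1 v=circle i=sm50;
title1 'Residual Plot';
proc gplot data=oneres;
    plot res*pred / frame;
run;
```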

    proc univariate data=oneres pctldef=4;
        var res;
        qqplot res / normal (L=1 mu=est sigma=est);
        histogram res / normal;
    run;

proc univariate gives basic numerical descriptions for each variable you request. If you leave out the var statement, SAS describes all the numeric variables in the data set. Including the qqplot statement adds a normal quantile plot, and including the histogram statement adds a histogram and overlays, in this case, a normal distribution. We will discuss these in some detail in class.

    symbol1 v=circle i=none;
    title1 'Plot of residuals vs time';
    proc gplot;
        plot res*time / vref=0 vaxis=-6 to 6 by 1;
    run;
    quit;

This generates a plot of the residuals versus time. proc gplot is an interactive procedure and keeps running after run, so add a quit statement at the end of the program to terminate it.