Introduction to Splus
In this lab, we will learn data input/output, splus graphics and
basic data analysis functions. The tutorial material written by
B. Venables and D. Smith is highly recommended.
Splus is an object-oriented programming and statistical analysis language.
Data in splus is aggregated into "objects" of different types. The most
common object types are vectors, matrices, data frames and lists. Other
object types include time series, arrays, factors, etc. Most work in splus
is accomplished by functions, which typically create new objects from
existing ones. Splus is case sensitive.
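As a small illustration of these object types, the sketch below builds one of each once splus is running. The object names (v, m, d, lst) and their values are made up for this example; <- is the standard assignment operator and does the same job as the _ used later in this lab.

```r
# Hypothetical toy objects, one per common type
v <- c(1, 2, 4, 5, 6)                      # numeric vector
m <- matrix(1:6, nrow=2, ncol=3)           # 2 x 3 matrix, filled by column
d <- data.frame(x1=v, x2=v+1)              # data frame with two columns
lst <- list(name="demo", values=v)         # list mixing a string and a vector

length(v)      # number of elements in v
dim(m)         # rows and columns of m
names(d)       # column names of d
lst$values     # pull one component out of the list
```

Each call on the right-hand side is a function that creates a new object from existing ones, which is the usual way of working in splus.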
Create a directory called _Data to store your objects:
For h: drive
mkdir h:\Stat524
mkdir h:\Stat524\_Data
For a: drive
mkdir a:\Stat524
mkdir a:\Stat524\_Data
Now create directories to store ordinary data sets:
mkdir a:\Dataset
mkdir h:\Dataset
Create a simple data set called temp.txt with the contents below using
Microsoft Word, and save it to either h:\Dataset or a:\Dataset in text format:
x1 x2 x3
1 2 3
2 3 4
4 5 6
5 6 7
6 7 8
Download the data set at http://www.stat.purdue.edu/~yuzhu/stat524/Datasets/gala.data and save it to either h:\Dataset or a:\Dataset.
Go to standard software\statistical packages\s-plus 4.0 and click on it.
If a popup window asks for directory setup, just click on okay
or cancel; you will probably need to open s-plus 4.0 again. Though the
windows version of splus provides a user-friendly interface, we will
focus on the language itself and primarily use the command window.
Your first job every time you start is to tell splus where to store
your objects. In lab, we will use the h: drive because it seems to be fast.
>attach("h:\\Stat524\\_Data", pos=1)
>search()
search() lists the splus search path. "h:\Stat524\_Data" now occupies the
first position, which is where splus writes new objects.
Next, read the data sets into splus. We have already created two data sets in
h:\Dataset, temp.txt and gala.data.
>temp_read.table("h:\\Dataset\\temp.txt", header=T, row.names=NULL)
>gala_read.table("h:\\Dataset\\gala.data",header=T)
>temp
>gala
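There are two ways to get each variable from a data set. If all the
analyses will be about gala, attach it to the search path so that its
variables can be used by name (detach(2) removes it again):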
>attach(gala)
>Species
>Area
>search()
>detach(2)
>Species
Or, without attaching, use the $ operator:
>gala$Species
>gala$Sp
>gala$Area
>gala$Ar
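The shortened names above work because $ accepts any unambiguous abbreviation of a variable name. A minimal sketch on a made-up data frame (the name toy, its columns and its values are hypothetical, not taken from gala):

```r
# Hypothetical data frame, used only to show name abbreviation with $
toy <- data.frame(Height=c(150, 160, 170), Weight=c(50, 60, 70))

toy$Height     # full variable name
toy$He         # "He" matches only Height, so the same column is returned
toy$We         # "We" matches only Weight
```

Abbreviations save typing at the prompt, but full names are easier to read in anything you keep.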
Figure out what each of these commands does by experimentation.
>summary(gala)
>mean(gala$Species)
>min(gala$Sp)
>range(gala$Sp)
>var(gala$Sp)
>median(gala$Sp)
>stem(gala$Sp)
>gala$Endemics/gala$Species
>ratio_gala$Endemics/gala$Species
>summary(ratio)
>larea_log(gala$Area)
>larea
>var(gala)
>cor(gala[1:7])
>cor(gala[,-(1:2)])
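To check your guesses, it can help to try the same functions on a data set small enough to verify by hand. The sketch below types in the temp.txt values directly; the object name small is ours, and <- is equivalent to the _ assignment used above.

```r
# Same values as temp.txt, typed in directly (the name small is ours)
small <- data.frame(x1=c(1,2,4,5,6), x2=c(2,3,5,6,7), x3=c(3,4,6,7,8))

mean(small$x1)       # 3.6
median(small$x1)     # 4
range(small$x1)      # smallest and largest value: 1 6
var(small$x1)        # sample variance (divisor n-1): 4.3
cor(small)           # all correlations are 1, since x2 = x1 + 1 and x3 = x1 + 2
small[1:2]           # data frame holding only columns x1 and x2
small[,-(1:2)]       # negative indices drop columns 1 and 2, leaving x3
```

The last two lines show the two indexing forms used with gala: a single index selects columns, and a negative column index removes them.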
You can copy-paste the output into Microsoft Word to incorporate it into your
reports.
Splus has very strong graphical capabilities. To create a plot, open a
graphics window:
>win.graph()
>hist(gala$Sp)
You can save, print, and copy-paste a graph. If there is only
one graphics window open, a new graph will replace the old one. You
can open more than one graphics window. Sometimes you want to have
several plots on one page, so you need to partition the graphics window
first:
>par(mfrow=c(2,2))
>hist(gala$Sp)
>hist(gala$Area)
>hist(gala$Elevation)
>hist(gala$Scruz)
Return to the default format:
>par(mfrow=c(1,1))
Now see if you can figure out what these commands do:
>boxplot(gala$Sp)
>plot(gala$Sp,gala$En)
>plot(gala$Sp,gala$En,xlab="Species",ylab="Endemics",main="Title here")
>plot(gala$Sp,gala$En,xlab="Species",ylab="Endemics",main="Title here",type="n")
>text(gala$Sp,gala$En,row.names(gala))
>pairs(gala)
>par(mfrow=c(2,3))
>for(i in 2:7) boxplot(gala[,i])
>plot(gala)
While running splus you can get help about a particular command. For
example, to find out what the command pairs does, type
>help(pairs)
If you do not know the name of the command you want to use, you can type
>help()
and then browse. You are encouraged to explore splus further using
the online materials or the books you bought.
Before quitting, it is usually a good idea to clean up your workspace,
to keep it from being cluttered up with objects you will not use again.
To view the names of the objects in the workspace, type
>objects()
To remove unwanted objects, such as the data temp, simply type
>rm(temp)
>objects()
Now you can quit splus by closing the splus window, or by entering
>q()