Spring, 2007
Monday and Wednesday, 4:30 pm - 5:45 pm.
Science and Technology I, room 122
Instructor: James Gentle
TA: Ruihua Xu
Office hours: Mondays 6:00 pm - 7:00 pm; Fridays 10:00 am - 11:00 am.
Room 146, Science & Technology II.
If you send email to the instructor or to the TA,
please put "Stat 362" or "IT 362" in the subject line.
This web page and the links to notes
will evolve as the semester progresses. In
particular, links to solutions to assignments and quizzes will be posted
at appropriate times (after the fact, of course!).
The general description of the course is available at
www.scs.gmu.edu/~jgentle/stat362/
The references below refer to
- Ronald P. Cody and Jeffrey K. Smith (2006),
Applied Statistics and the SAS Programming Language, Fifth Edition,
Pearson Prentice Hall, Upper Saddle River, NJ.
- Lora D. Delwiche and Susan J. Slaughter (2003),
The Little SAS Book, Third Edition, SAS Institute Inc., Cary, NC.
Note the websites for the books, where sample programs and datasets are
available:
The final exam is May 9 at 4:30.
Week 1: Jan 22, 24
- Computer systems and organization; applications software.
- Data organization for statistical analysis.
The "standard" rectangular array
- Observations; rows ("cases", "instances", "records", etc.)
- Variables; columns ("features", "attributes", "fields", etc.)
- Introduction to the SAS software system
References: Applied Statistics, Chapter 1, and
The Little SAS Book, Chapter 1.
- Starting SAS
- The SAS DATA step
- Defining variables and entering data
- Types of data; numeric data, character data
- Creating new variables; the assignment statement
- Simple SAS PROCs: PRINT, MEANS, FREQ, CORR, and SORT
- Assignment 1: Exercises 1.2, 1.4, and 1.8 in Applied Statistics
(due Monday Jan 29).
- Solution
Week 2: Jan 29, 31
- More on the SAS DATA step
- Missing values
- More on the INPUT statement
- Input from an ASCII data file
(Reference: The Little SAS Book, Chapter 2).
- DROP/KEEP
- Input from another SAS dataset; the SET statement
- Conditional execution
(reference: The Little SAS Book, Sections 3.4 and 3.5)
- Review of basic statistics
(Reference: Applied Statistics, Chapter 2.)
- Descriptive statistics
- Populations and parameters; samples and statistics
- Statistical inference
- Hypothesis testing (t tests); one-sided tests, two-sided tests
- Estimation: confidence intervals
- Q-Q plots
- More about PROCs
- PROC SORT (reference: The Little SAS Book, Section 4.3).
- Sortkeys
- The BY statement
- The WHERE statement
(reference: The Little SAS Book, pages 102, 103, 306, and 307)
- PROC MEANS and PROC UNIVARIATE (reference: The Little SAS Book, Sections
4.9, 8.1, and 8.2).
- SAS statements
- t tests in SAS
- Assignment 2: Exercises 2.2, 2.4, and 2.8 in Applied Statistics
(due Monday Feb 5).
- Solution
Week 3: Feb 5, 7
- More on SAS datasets
(reference: The Little SAS Book, Sections 2.19-2.22 and
Sections 3.1-3.3).
- External files
- Saving and reusing SAS datasets
- Additional statements in the SAS DATA step:
- SAS functions (reference: Applied Statistics, Chapter 17)
- Simple functions (SQRT, LOG, RANUNI, etc.)
- Functions to compute statistics within observations
- LABEL, FORMAT
- OUTPUT/DELETE, DROP/KEEP, RENAME
- Multiple datasets within one DATA step
- Working with subsets of a dataset; OBS, FIRSTOBS
- Making programs work correctly; reading assignment:
The Little SAS Book, Chapter 10
- Assignment 3 (due Monday Feb 12).
- Solution
Week 4: Feb 12, 14
Week 5: Feb 19, 21
- Monday, Feb 19
- Wednesday, Feb 21
- Review of Quiz 1: Version a, Version b, Version c.
- More on concatenation, interleaving, merging, and updating of SAS datasets.
- PROC FORMAT; custom formats, recoding
(references: The Little SAS Book, Section 4.7,
and Applied Statistics, Chapter 3).
- Still more on the DATA step
(reference: The Little SAS Book, Chapters 2, 3, and 4).
- Lagged variables
- Automatic variables
- Arrays (reference: Applied Statistics, Chapter 15)
- DO loops (reference: Applied Statistics, page 419)
- Labels
- More on the INPUT statement and controlling output
- Character data
- PUT and FILE statements
- Assignment 5 (due Monday Feb 26).
- Solution
Week 6: Feb 26, 28
Week 7: Mar 5, 7
- More on the analysis of categorical data; interpretation of SAS output.
(Reference: Applied Statistics, Chapter 3.)
- Are two categorical variables related (or are they "independent")?
Chi-squared.
- Unmatched case-control studies. Chi-squared and odds ratio.
- Matched case-control studies. McNemar test.
- Agreement. Kappa.
- Linear trend. Mantel-Haenszel chi-squared.
- Assignment 7: Exercises 3.9, 3.10, 3.11, and 3.15 in Applied Statistics
(due Monday Mar 19).
- Solution
No class Mar 12, 14; GMU Spring Break
Week 8: Mar 19, 21
- Review: manipulations of SAS datasets in the DATA step; PROC FORMAT;
categorical data.
- Wednesday, Mar 21 Quiz 2
Week 9: Mar 26, 28
Week 10: Apr 2, 4
(Reference: Applied Statistics, Chapter 5)
Week 11: Apr 9, 11
(Reference: Applied Statistics, Chapter 9)
Week 12: Apr 16, 18
(Reference: Applied Statistics, Chapter 9)
- Regression models with categorical covariates; "dummy" variables.
- Assignment 11: Exercises 9.1 and 9.2
in Applied Statistics
(due Monday Apr 23).
Week 13: Apr 23, 25
- Monday, Apr 23
- Quiz 3
(regression analysis).
- Wednesday, Apr 25
- More on SAS datasets:
- the pointer in the INPUT statement
- informats/formats
- date data (reference: The Little SAS Book, Sections 3.7, 3.8)
- SQL (reference: The Little SAS Book, Appendix F)
- importing and exporting SAS datasets (reference: The Little SAS Book,
Sections 2.15-2.19)
- SAS macros
(reference: The Little SAS Book, Chapter 7, and
Applied Statistics, p. 94).
- Assignment 12 (due Monday Apr 30).
Week 14: Apr 30, May 2
- Statistical data on the web.
- More on SAS/GRAPH; Maps.
Exam: May 9, 4:30 - 7:15
The final is comprehensive; it is not limited to material that has appeared
on previous quizzes. We have covered a lot of material in this course, and
so the exam can only be a sampling of what was covered.