Instructor: James Gentle

TA: Ruihua Xu

Office hours: Mondays 6:00pm--7:00pm; Fridays 10:00am--11:00am.

Room 146, Science & Technology II.

If you send email to the instructor or to the TA, please put "Stat 362" or "IT 362" in the subject line.

This web page and the links to notes will evolve as the semester progresses. In particular, links to solutions to assignments and quizzes will be posted at appropriate times (after the fact, of course!).

The general description of the course is available at www.scs.gmu.edu/~jgentle/stat362/

The references below refer to

- Ronald P. Cody and Jeffrey K. Smith (2006), Applied Statistics and the SAS Programming Language, Fifth Edition, Pearson Prentice Hall, Upper Saddle River, NJ.
- Lora D. Delwiche and Susan J. Slaughter (2003), The Little SAS Book, Third Edition, SAS Institute Inc., Cary, NC.

Note the websites for the books, where sample programs and datasets are available:

The final exam is May 9 at 4:30.

- Computer systems and organization; applications software.
- Data organization for statistical analysis.

The "standard" rectangular array:
- Observations; rows ("cases", "instances", "records", etc.)
- Variables; columns ("features", "attributes", "fields", etc.)

- Introduction to the SAS software system

References: Applied Statistics, Chapter 1, and The Little SAS Book, Chapter 1.
- Starting SAS
- The SAS DATA step
- Defining variables and entering data
- Types of data; numeric data, character data
- Creating new variables; the assignment statement
- Simple SAS PROCs: PRINT, MEANS, FREQ, CORR, and SORT
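
The topics above (defining variables, numeric and character data, the assignment statement, and a few simple PROCs) can be sketched in one short program. This is only an illustration, not one of the course's sample programs: the dataset name `scores`, its variables, and the data values are all made up.

```sas
* Hypothetical dataset and values, for illustration only;
data scores;
    input name $ test1 test2;        /* name is character; test1 and test2 are numeric */
    average = (test1 + test2) / 2;   /* assignment statement creates a new variable */
    datalines;
Cal 95 88
Ann 80 90
Bob 70 85
;
run;

/* Sort the observations by name, replacing the dataset in place */
proc sort data=scores;
    by name;
run;

/* List the observations */
proc print data=scores;
run;

/* Summary statistics for the numeric variables */
proc means data=scores;
    var test1 test2 average;
run;
```

PROC FREQ and PROC CORR follow the same pattern of a `data=` option followed by statements that name the variables to analyze.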

- Assignment 1: Exercises 1.2, 1.4, and 1.8 in Applied Statistics (due Monday Jan 29).
- Solution

- More on the SAS DATA step
- Missing values
- More on the INPUT statement
- Input from an ASCII data file (reference: *The Little SAS Book*, Chapter 2).
- DROP/KEEP
- Input from another SAS dataset; the SET statement
- Conditional execution (reference: *The Little SAS Book*, Sections 3.4 and 3.5)

- Review of basic statistics (reference: Applied Statistics, Chapter 2.)
- Descriptive statistics
- Populations and parameters; samples and statistics
- Statistical inference
- Hypothesis testing (t tests); one-sided tests, two-sided tests
- Estimation: confidence intervals

- Q-Q plots

- More about PROCs
- PROC SORT (reference: *The Little SAS Book*, Section 4.3).
- Sort keys
- The BY statement

- The WHERE statement (reference: *The Little SAS Book*, pages 102, 103, 306, and 307)
- PROC MEANS and PROC UNIVARIATE (reference: *The Little SAS Book*, Sections 4.9, 8.1, and 8.2).
- SAS statements
- t tests in SAS

- Assignment 2: Exercises 2.2, 2.4, and 2.8 in Applied Statistics (due Monday Feb 5).
- Solution

- More on SAS datasets (reference: *The Little SAS Book*, Sections 2.19-2.22 and Sections 3.1-3.3).
- External files
- Saving and reusing SAS datasets
- Additional statements in the SAS DATA step:
- SAS functions (reference: *Applied Statistics*, Chapter 17)
- Simple functions (SQRT, LOG, RANUNI, etc.)
- Functions to compute statistics within observations

- LABEL, FORMAT
- OUTPUT/DELETE, DROP/KEEP, RENAME

- Multiple datasets within one DATA step
- Working with subsets of a dataset; OBS, FIRSTOBS

- Making programs work correctly; reading assignment: *The Little SAS Book*, Chapter 10
- Assignment 3 (due Monday Feb 12).
- Solution

- Graphs for understanding data
- Graphs in SAS
- Wednesday, Feb 14: Class canceled due to weather.
- Assignment 4: Exercises 2.10 and 2.12 in Applied Statistics (due Monday Feb 19).
- Solution

- Monday, Feb 19
- Combining and manipulating SAS datasets (reference: *The Little SAS Book*, Chapter 6).

Concatenation, interleaving, merging, updating.
- Quiz 1 (50 minutes; material through week 4).

Here's a sample from a previous year. Of course yours will be different!! (There may also be some slight differences in coverage this semester.)
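
Two of the dataset operations listed for this session, concatenation and merging, can be sketched as follows. This is a minimal illustration under stated assumptions: the dataset names (`first_half`, `second_half`, `names`), the variables, and the values are hypothetical and are not taken from the course materials.

```sas
* Hypothetical datasets and values, for illustration only;
data first_half;
    input id score;
    datalines;
1 80
2 75
;
run;

data second_half;
    input id score;
    datalines;
3 90
4 65
;
run;

data names;
    input id name $;
    datalines;
1 Ann
2 Bob
3 Cal
4 Dee
;
run;

/* Concatenation: the observations of the second dataset follow those of the first */
data all_scores;
    set first_half second_half;
run;

/* A match-merge requires each dataset to be sorted by the BY variable */
proc sort data=all_scores;
    by id;
run;

proc sort data=names;
    by id;
run;

/* Merging: observations with the same id are joined into one observation */
data combined;
    merge all_scores names;
    by id;
run;

proc print data=combined;
run;
```

Interleaving uses the same SET statement as concatenation but adds a BY statement, so the sorted datasets are combined in BY-variable order rather than stacked.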

- Wednesday, Feb 21
- Review of Quiz 1. Version a. Version b. Version c.
- More on concatenation, interleaving, merging, and updating of SAS datasets.
- PROC FORMAT; custom formats, recoding (references: *The Little SAS Book*, Section 4.7, and Applied Statistics, Chapter 3).
- Still more on the DATA step (reference: *The Little SAS Book*, Chapters 2, 3, and 4).
- Lagged variables
- Automatic variables
- Arrays (reference: *Applied Statistics*, Chapter 15)
- DO loops (reference: *Applied Statistics*, page 419)
- Labels
- More on the INPUT statement and controlling output
- Character data
- PUT and FILE statements

- Assignment 5 (due Monday Feb 26).
- Solution

- Categorical data (reference: Applied Statistics, Chapter 3.)
- The Output Delivery System (reference: *The Little SAS Book*, Chapter 5).
- Assignment 6: lagged variables, loops, etc., and Exercises 3.3, 3.6, and 3.7 in Applied Statistics (due Monday Mar 5).
- Solution

- More on the analysis of categorical data; interpretation of SAS output (reference: Applied Statistics, Chapter 3.)
- Are two categorical variables related (or are they "independent")? Chi-squared.
- Unmatched case-control studies. Chi-squared and odds ratio.
- Matched case-control studies. McNemar test.
- Agreement. Kappa.
- Linear trend. Mantel-Haenszel chi-squared.
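
As a rough sketch of how the first two analyses in this list are typically requested in SAS, the program below runs PROC FREQ on a two-by-two table for an unmatched case-control layout. The dataset `exposure_counts`, its variables, and the cell counts are hypothetical and serve only to make the program self-contained.

```sas
* Hypothetical cell counts, for illustration only;
data exposure_counts;
    input exposure $ outcome $ count;
    datalines;
Yes Case 30
Yes Control 20
No Case 15
No Control 35
;
run;

proc freq data=exposure_counts;
    weight count;                             /* each input line is a cell count, not one subject */
    tables exposure*outcome / chisq relrisk;  /* chi-squared tests; odds ratio and relative risks */
run;
```

The CHISQ output also includes the Mantel-Haenszel chi-squared statistic. For a square table of matched or paired ratings, the AGREE option on the TABLES statement requests McNemar's test and kappa.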

- Assignment 7: Exercises 3.9, 3.10, 3.11, and 3.15 in Applied Statistics (due Monday Mar 19).
- Solution

- Review: manipulations of SAS datasets in the DATA step; PROC FORMAT; categorical data.
- Wednesday, Mar 21: Quiz 2

- Review of Quiz 2. Version a. Version b. Version c. Version d. Version e.
- Correlation and regression
- Correlation
- Simple regression models
- Basic model fitting
- AOV table and F test
- Coefficient estimates and t tests

- Regression in SAS (reference: *The Little SAS Book*, Sections 8.5 and 8.6).
- PROC REG statements
- MODEL statement
- P option
- BY statement
- PLOT statement
- P. and R. keywords
- OVERLAY option

- Output of PROC REG
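
The PROC REG statements listed above (apart from BY) might be combined as in the sketch below. The dataset `study`, the variables `hours` and `score`, and the values are hypothetical. The PLOT statement produces traditional graphics, so what it displays may depend on the SAS release and graphics settings in use.

```sas
* Hypothetical dataset and values, for illustration only;
data study;
    input hours score;
    datalines;
1 52
2 58
3 61
4 70
5 74
6 83
;
run;

proc reg data=study;
    model score = hours / p;              /* P option lists predicted values and residuals */
    plot score*hours p.*hours / overlay;  /* observed and predicted values on one set of axes */
    plot r.*p.;                           /* residuals against predicted values */
run;
quit;
```

The keywords `p.` and `r.` (with the trailing period) refer to the predicted values and residuals computed by the MODEL statement, not to variables in the dataset.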

- Assignment 8
- SAS program.

- More on correlation and regression analysis
- Measuring variation and co-variation
- More on regression analysis
- Inspection of residuals -- patterns; polynomial regression
- Confidence intervals; prediction intervals
- Regression fitting
- Adequacy of fit: R-squared

- Assignment 9: Exercises 5.1, 5.5, 5.7, 5.9, and 5.10 in Applied Statistics (due Monday Apr 9).
- SAS program.

- More on multiple linear regression
- Regression model building
- Adequacy of fit: R-squared, adjusted R-squared
- Selection of variables in regression.

- Assignment 10 (due Monday Apr 16).
- SAS program.

- Regression models with categorical covariates; "dummy" variables.
- Assignment 11: Exercises 9.1 and 9.2 in Applied Statistics (due Monday Apr 23).

- Monday, Apr 23
- Quiz 3 (regression analysis).

- Wednesday, Apr 25
- More on SAS datasets:
- the pointer in the INPUT statement
- informats/formats
- date data (reference: *The Little SAS Book*, Sections 3.7, 3.8)
- SQL (reference: *The Little SAS Book*, Appendix F)
- importing and exporting SAS datasets (reference: *The Little SAS Book*, Sections 2.15--2.19)

- SAS macros (reference: *The Little SAS Book*, Chapter 7, and *Applied Statistics*, p. 94).

- Assignment 12 (due Monday Apr 30).

- Statistical data on the web.
- More on SAS/GRAPH; Maps.