Announcements
- July 29 (10:01 AM)
- I have finished grading the exams, but haven't done a lot with the project reports yet. Here is some info
about the exam grades:
- 4 students got scores in [77.4, 81.1],
- 2 students got scores in [65.4, 66.5],
- 2 students got scores in [58.0, 61.1],
- 3 students got scores in [47.7, 51.5],
- 2 students got scores of 43.9 and below.
I think that the four highest scores are pretty good considering the difficulty of the exam. Even the scores close
to 50% are acceptable. It was interesting grading the exams --- no one portion of it seemed too tough for everyone,
and some students who finished close to the bottom did the best on some portions of the exam (really nailing some
answers better than anyone else did).
- July 25 (7:04 AM)
- I made some slight modifications to the
web page about boosting. It makes more sense now, because I noted that in
some situations classification using stumps can be stable, and in other situations it can be unstable (and have high
variance). I also added a tiny section on bag-boosting. I slightly modified the
web page about bagging to include a (brief) description of subbagging.
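For a concrete picture of subbagging, here is a minimal sketch in Python (the course demos themselves are in R). It assumes the common definition, subsample aggregating: each base learner is fit to a subsample drawn without replacement (here half of the data) rather than to a bootstrap sample, and the fitted learners are averaged. The linked web page may describe it differently. The function names `fit_stump` and `subbag` and the toy data are hypothetical.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression stump by minimizing squared error."""
    best = None
    # Candidate thresholds exclude the largest x so both sides are nonempty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all x values identical: fall back to the overall mean
        m = mean(ys)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def subbag(xs, ys, n_models=50, frac=0.5, seed=0):
    """Average stumps fit to subsamples drawn WITHOUT replacement."""
    rng = random.Random(seed)
    n = len(xs)
    m = max(2, int(frac * n))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(n), m)  # bagging would instead draw n indices with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(f(x) for f in models)

# Hypothetical toy data: the response steps from 0 to 1 at x = 10.
xs = list(range(20))
ys = [0.0] * 10 + [1.0] * 10
predict = subbag(xs, ys)
```

Calling `predict(2)` and `predict(17)` gives the averaged stump predictions on either side of the step.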
- July 24 (7:06 PM)
- Someone asked me about extending my office hours in order to answer questions about the material to be covered on the
exam. Unfortunately, I have some afternoon appointments on Tuesday that hem me in so that I can't start my class
office hours prior to 6 PM. But I can be available for questions on Monday, Wednesday, and Thursday.
I'll state that I'll start my class office hours at 5:30 PM on Thursday, but I'll just make individual appointments
to meet with people at other times. So, if you want to meet with me about the exam, or about your project, contact
me by e-mail to set up a time to meet. Good times would be 4-10 PM on Monday, 4-10 PM on Wednesday, and 3-5 PM on
Thursday (and then I'll have office hours for whoever wants to stop by from 5:30-7:00 PM on Thursday). If two or
three of you want to stop by at the same time, that will be fine --- just let me know when you want to meet.
- July 24 (5:42 AM)
- I didn't get around to posting an announcement on Thursday that shortly before class I added links on the
lecture supplements web page (under the heading for Lecture 12)
to notes for three presentations that I did related to TreeNet. Also, I added a link (under the heading for
Lecture 11) to some notes I wrote which summarize the coverage of neural network models in HTF. (I didn't present
these notes in class, but I added them to my web site for the sake of completeness.)
Just this morning, I updated the
web page pertaining to the final exam to give a fairly thorough description of
the exam. Finally, I have now e-mailed comments to all but one person who presented their projects on Thursday or
Friday (and I will try to get to the other person before I go to bed). So, if you presented your project already,
look for my e-mailed comments. (So far the project presentations have gone very well --- the timing has been good.
The content has also been generally rather good.)
- July 21 (9:57 AM)
- I added links on the
lecture supplements web page (under the heading for Lecture 12)
to some R examples showing the use of the adaboost function.
- July 19 (5:44 PM)
- I added a link on the
lecture supplements web page (under the heading for Lecture 11)
to some notes about PPR.
- July 19 (2:24 AM)
- I added a link on the
lecture supplements web page (under the heading for Lecture 12)
to some material that I will present in today's class (about boosting).
- July 18 (8:15 AM)
- I added a suggestion for something to do prior to Tuesday's lecture on the
home work web page.
- July 18 (2:59 AM)
- I added a link on the
lecture supplements web page (under the heading for Lecture 3 (since it goes
with the Lecture 3 material)) for an R demo featuring stepwise LDA for classification. (Stepwise LDA is a bit
similar to stepwise regression --- predictors can be added and removed in an attempt to identify the best set of
predictors to use for LDA classification.)
Also, I added some new information to the
web page pertaining to the projects. (The new information is in bright red.)
I also made some adjustments to the schedule of the project presentations, adding some information and changing the
time for one person (as requested).
- July 17 (8:13 PM)
- I added a link on the
lecture supplements web page (under the heading for Lecture 11)
for an R demo which shows how to use projection pursuit regression (PPR). By going through the demo, it can be seen
that a PPR model does better than all of the other models investigated. (It can also be seen that PPR is easy to
use!)
- July 17 (8:36 AM)
- I added a link on the
lecture supplements web page (under the heading for Lecture 11)
to an updated R demo (pertaining to the example on p. 17 of HTF), which now includes classification by
support vector machines (which is often a very good method to try).
Later on today, I hope to add or modify some more R demos --- something to show how to use PPR, and perhaps something
to show how to use stepwise LDA.
- July 16 (8:38 PM)
- I posted an improved version of my presentation about Mallows' C_p, which is linked to from the
lecture supplements web page (under the heading for Lecture 9).
(The page that didn't scan in well has been fixed, and the spelling of Mallows' was corrected.) Also, I added a link
to a figure on my separating hyperplanes presentation which is linked to from the
lecture supplements web page (under the heading for Lecture 11).
- July 14 (9:34 AM)
- I posted a link to some material that I will present today (about separating hyperplanes and support vector
machines)
on the
lecture supplements web page (under the heading for Lecture 11).
On the web page that is linked to, I need to install a link to a figure (which I will do after I get the figure
scanned in).
- July 13 (9:30 PM)
- I posted a link to my presentation about Mallows' C_p
on the
lecture supplements web page (under the heading for Lecture 9).
One page didn't scan properly, and needs to be fixed, and also I plan to fix where I have Mallow's instead of
Mallows' in two places.
- July 13 (2:53 AM)
- I replaced the notes that didn't scan correctly with better versions. Also I edited the bagging and random
forests notes which I previously posted.
- July 12 (1:25 PM)
- I posted a link to my presentation about MARS
on the
lecture supplements web page (under the heading for Lecture 7).
Three of the pages are currently messed up (the drunken toddler again), but I hope to get them fixed.
- July 12 (8:16 AM)
- I posted a link to some notes about random forests
on the
lecture supplements web page (under the heading for Lecture 10).
(I intend to add (later) some information about applying a random forest (created by the RandomForests software) to
obtain predictions.)
- July 11 (6:31 PM)
- I posted links to some material related to last Thursday's lecture
on the
lecture supplements web page (under the heading for Lecture 9).
(p. 7 of one of the documents didn't scan properly --- it looks like a drunk 5 year old wrote it. Hopefully, I can
get it fixed soon.) Also, I edited the notes on bagging that I posted this morning --- I took out a few things that
didn't seem needed, and then added some new remarks at the end.
- July 11 (8:30 AM)
- I posted a link to some information about bagging
on the
lecture supplements web page (under the heading for Lecture 10).
- July 10 (8:58 PM)
- I added some suggestions for things to do prior to Tuesday's lecture on the
home work web page.
Also, I posted links to some new or modified demos on the
lecture supplements web page
(under the Lecture 9 heading). Since we only have three more lecture periods prior to the project presentations and
the final exam, I advise that you keep up (or catch up) with the suggested reading, and study the posted lecture
notes and the demos that I prepare, since I do want to keep moving along at a fast pace and cover most of the items
that are on the syllabus. (Note: If I manage to cover the use of RandomForests and TreeNet, then I may
not get to cover Weka. This isn't so bad since I've discovered that a lot of the things that I once thought we
needed Weka for can be done using R.)
- July 9 (6:57 PM)
- I added some new information to the
web page pertaining to the projects.
The new stuff is in red, and included is the schedule for your project presentations. If there is a problem with when I
have you scheduled to present, bring it to my attention soon.
- July 7 (1:32 PM)
- I've now posted links to scanned versions of my notes about Linear Splines, Cubic Splines, and Smoothing Splines
on the
lecture supplements web page (under the headings for Lecture 6 and Lecture 7).
I believe this is everything so far except for the MARS notes, and I won't be able to post them until I can get some
more storage space (which I've been trying hard to do for well over a week now).
- July 7 (1:42 AM)
- I forgot to announce that Wednesday morning I posted a link to a scanned version of my Classification by Density
Estimation notes on the
lecture supplements web page (under the Lecture 8 heading).
- July 5 (6:41 AM)
- I posted a link on the
lecture supplements web page (under the Lecture 9 heading)
to a new R demo that I created pertaining to material that I will present in class on Thursday. On the
home work web page
I suggest that you go through the demo prior to class. Identify any questions that you may have pertaining to it, and ask
them on Thursday.
- July 4 (7:34 AM)
- I added some information about the scheduling of the oral project presentations on the
web page pertaining to the projects.
By no later than Thursday afternoon you should e-mail me some information about your project --- e.g., will it
pertain to classification or regression, what data set are you using, and on what day would you like to present it to
the class (or to my Friday afternoon seminar group). Also, I posted a bit of new stuff on the
lecture supplements web page
and the
home work web page.
Of particular interest is that I've added some new parts to the R analysis I did of the situation introduced
in the example on p. 17 of HTF. The new material can be found in three places, each of which is clearly marked.
In one place, I refer to a problem that I encountered --- I can't get R to do what I want it to do, and I haven't
been able to make much sense of the error messages and warnings that I've run into. I'm hoping that one of you will
know how to fix the problem. (Also let me know if you are aware of any R functions that will do kernel
density estimation in dimensions higher than 2.)
Finally, I'll announce that even though we don't have class on Tuesday 7/5, I can make myself available to meet with
you about your project if you'd like to schedule an appointment (between 6 PM and 10 PM). I can perhaps find a bit
of time to meet with you on Wednesday if you cannot stop by on Tuesday. In any case, e-mail me to make a specific
appointment.
- June 29 (10:29 PM)
- I posted a link to a scanned version of my LDA presentation (from the 3rd lecture) on the
lecture supplements web page (under the Lecture 3 heading).
I think that I can eventually post scanned versions of the other lecture presentations that I've shown you that haven't been
posted previously, but I need to get more space for my web site.
- June 27 (8:27 AM)
- I posted links on the
lecture supplements web page
to some R examples that I want you to read over prior to Tuesday's class so that you can
ask about parts of them that you don't understand. (I will go over some of the important parts in class, but I won't
go through them function by function.) See the
home work web page
for a related exercise.
- June 24 (5:42 AM)
- I added a link to a web page version of my Basis Functions presentation to the
lecture supplements web page.
I hope to get some version of my material on splines posted eventually.
- June 24 (12:33 AM)
- There is now an improved version of the material for my talk on logistic regression (from the 3rd class meeting)
linked to from the
lecture supplements web page.
- June 23 (12:53 AM)
- I made a link to material for my talk on logistic regression (from the 3rd class meeting) on the
lecture supplements web page,
and I added a link to the CART Walkabout on the
home work web page.
- June 22 (7:23 AM)
- See the
home work web page
for what you should do prior to tomorrow's class. (I mentioned these things in class Tuesday night.)
Also, I added one more data set possibility (but it may be too small) to the
web page pertaining to the projects,
and I inserted a comment into one of my R examples linked to from the
lecture supplements web page
(about how one should typically standardize the predictors prior to using k-nearest-neighbors classification).
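To see why standardizing matters for k-nearest-neighbors, here is a minimal sketch in Python (not the R example referred to above). The data, the predictor scales, and the function names are all hypothetical. The first predictor is on a scale of tens of thousands and the second on a scale of single units, so raw Euclidean distance is dominated by the first predictor even though the classes differ mainly on the second.

```python
import math

def column_stats(rows):
    """Return per-column means and sample standard deviations."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return means, sds

def standardize(row, means, sds):
    """Center and scale one observation using the training-set statistics."""
    return [(v - m) / s for v, m, s in zip(row, means, sds)]

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical training data: the classes differ on the second predictor.
train_x = [[30000, 1.0], [32000, 1.2], [31000, 0.9],
           [30500, 5.0], [31500, 5.2], [32500, 4.8]]
train_y = ["A", "A", "A", "B", "B", "B"]
query = [32400, 1.1]

# Without standardizing, the large-scale predictor decides the neighbors.
raw_pred = knn_predict(train_x, train_y, query, k=3)

# With standardizing, both predictors contribute on comparable scales.
means, sds = column_stats(train_x)
std_train = [standardize(r, means, sds) for r in train_x]
std_pred = knn_predict(std_train, train_y, standardize(query, means, sds), k=3)
```

Note that the query is standardized with the means and standard deviations computed from the training data, not with its own.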
- June 21 (9:22 AM)
- I added a suggested problem to the
home work web page.
Also, I made some minor changes to some of the R examples I link to from the
lecture supplements web page,
but none of the changes affect the output from the various functions.
Finally, I updated the file used for the R demo during the first class, to make the comments much better and
to show how to use cross-validation with k-nearest-neighbors classification. I think that I'll cover some of
that briefly tonight, but please go through it all outside of class and ask questions about any parts of it that you
don't understand. Since part of the material addressed was covered on June 16, I linked to the file under both the
June 7 and June 16
headings on the
lecture supplements web page.
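As a rough illustration of using cross-validation with k-nearest-neighbors classification, here is a minimal sketch in Python (not the R demo file mentioned above). It assumes leave-one-out cross-validation as the particular form of cross-validation, and uses it to compare a few values of k. The data set, including one deliberately odd point, and the function names are hypothetical.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def loo_error(xs, ys, k):
    """Leave-one-out estimate of the misclassification rate for a given k."""
    wrong = 0
    for i in range(len(xs)):
        # Hold out observation i and classify it using all the others.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) != ys[i]:
            wrong += 1
    return wrong / len(xs)

# Hypothetical data: two well-separated clusters, plus one point labeled "A"
# that sits inside the "B" cluster.
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 2),
      (10, 10), (11, 10), (10, 11), (11, 11), (12, 12),
      (11.2, 10.1)]
ys = ["A"] * 5 + ["B"] * 5 + ["A"]

# Odd values of k avoid tied votes with two classes.
errors = {k: loo_error(xs, ys, k) for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
```

With k = 1 the odd point misleads its nearest neighbors, so a larger k gives a lower cross-validated error on this toy data.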
- June 20 (9:58 AM)
- I added a link to an R example (dealing with the lasso, ridge regression, and principal components
regression) to the
lecture supplements web page
(putting it under the heading for 6/16, since it goes with that lecture material).
I would like for you to go through this on your own prior to Tuesday's class. I will discuss it some in class, but
I'll depend on you to ask questions about any parts of it that you don't understand.
Also, I moved the link to the CART material, putting it under the heading for 6/21, since I will present the rest of
it (most of it) tomorrow. Finally, I created a new
web page pertaining to the projects, and have it linked from the home page.
I'll add more information about the projects later.
- June 19 (12:55 AM)
- Right before class on 6/16 I added links to new web pages (related to the 6/16 lecture) on the
lecture supplements web page, but I didn't announce them on this page.
Over the past few days I have slightly modified those web pages (based in part on good student comments made in
class), and I also added a link to a scanned version of my CART presentation. Finally, I modified the R examples
for linear classifiers (also linked to from the
lecture supplements web page)
to show how to do hypothesis tests for logistic regression models, and to include an example for which the predictors
have nonnormal distributions.
- June 14 (7:02 PM)
- I added a new R demo (linked to under the June 14
heading on the
lecture supplements web page). Also, I added some more information about
downloading the Salford Systems software on
this web page. Read this new information if you encounter problems
downloading the software.
- June 13 (7:24 AM)
- I added a new R demo (linked to under the June 9
heading on the
lecture supplements web page). This demo covers topics that I talked about
last Thursday, but didn't put into the previously discussed regression demo. Please go over it on your own prior to
Tuesday's class and ask questions in class about specific parts of it --- I'm not going to go through it line by line
since at this point you should be able to grasp most of it without a lot of verbal explanation. In class on Tues I
will cover LDA, QDA, and logistic regression --- somewhat simple methods for classification. Then on Thurs, I will
tie up loose ends from what I addressed in the previous three classes. I will delay the two CART lectures until next
week (which will give you next weekend to do a lot of reading in the CART book).
- June 11 (6:33 AM)
- I edited both of the web pages that I went over in class on Thursday (linked to under the June 9
heading on the
lecture supplements web page). The major change was that I added information
about component plus residual plots at the bottom of the
OLS Regression web page. Clearly, there is lots more that I could add to all
of the previously posted lecture supplements, but I need to resist doing too much of that and focus on preparing new
supplements for next week's lectures.
- June 10 (5:52 PM)
- I have now sent Salford Systems the list of student names. Please go to
this web page to obtain information about downloading and installing the
software. (Do this within the next few days so that any problems can be addressed before I start covering the use
of the software in class.) Please read my instructions carefully so that things will hopefully go smoothly.
(Also, don't give the information about downloading the 90 day versions of the software to anyone. (Only people with
names matching those I sent Salford Systems can get the special unlock codes.))
- June 10 (2:18 AM)
- Right before class on Thursday evening, I posted links to the two web pages I went over in class
(OLS basics & R code for regression) on the
lecture supplements web page.
- June 9 (2:35 AM)
- I added two exercises to the
home work web page,
and also added some links to web pages having lots of data sets (some of which may be good for your projects) to the
home page for this course. In class today I'll lecture about the basics of
ordinary least squares regression, give an R demo about regression, and also continue with the classification
example I started last time. Please remind me to say more about grading, and the projects.
- June 7 (4:49 PM)
- I finished another lecture supplement (R commands for in-class demo) that is linked to from the
lecture supplements web page.
- June 7 (12:46 PM)
- I finished another lecture supplement that is linked to from the
lecture supplements web page.
- June 7 (6:56 AM)
- I created a
web page with links to lecture supplements. So far, I've only finished the linked
web page on Bayes Classifiers. (The power outage at GMU really messed me up --- my part of campus has been w/o power
since about 4:15 PM on Monday. (Tonight's lecture may be a bit rough in places.))
- June 6 (5:38 AM)
- Please see the
welcome page for advice about what you should do to prepare for the first
lecture.