Announcements


July 29 (10:01 AM)
I have finished grading the exams, but haven't done a lot with the project reports yet. Here is some info about the exam grades: I think that the four highest scores are pretty good considering the difficulty of the exam. Even the scores close to 50% are acceptable. It was interesting grading the exams --- no one portion of it seemed too tough for everyone, and some students who finished close to the bottom did the best on some portions of the exam (really nailing some answers better than anyone else did).
July 25 (7:04 AM)
I made some slight modifications to the web page about boosting. It makes more sense now, because I noted that in some situations classification using stumps can be stable, and in other situations it can be unstable (and have high variance). I also added a tiny section on bag-boosting. I slightly modified the web page about bagging to include a (brief) description of subbagging.
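To make the bagging versus subbagging distinction concrete, here is a minimal sketch. It is written in Python rather than the R used in the course demos, and it is not taken from the linked web pages. It assumes the usual definitions: bagging refits the classifier on bootstrap samples (n cases drawn with replacement), while subbagging refits it on smaller subsamples drawn without replacement (half the cases is a common choice). The function names, the toy data, and the one-predictor stump are all hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Bagging: draw n cases with replacement."""
    return [rng.choice(data) for _ in data]

def subsample(data, rng, fraction=0.5):
    """Subbagging: draw a fraction of the cases without replacement."""
    m = max(1, int(len(data) * fraction))
    return rng.sample(data, m)

def fit_stump(sample):
    """Fit a one-split classifier to (x, label) pairs; return a function x -> label."""
    majority = Counter(y for _, y in sample).most_common(1)[0][0]
    xs = sorted({x for x, _ in sample})
    best = None  # (errors, threshold, left_label, right_label)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = Counter(y for x, y in sample if x <= t)
        right = Counter(y for x, y in sample if x > t)
        l_lab = left.most_common(1)[0][0]
        r_lab = right.most_common(1)[0][0]
        errors = len(sample) - left[l_lab] - right[r_lab]
        if best is None or errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    if best is None:
        # Only one distinct x value in the sample: predict the majority label.
        return lambda x: majority
    _, t, l_lab, r_lab = best
    return lambda x: l_lab if x <= t else r_lab

def aggregate(data, sampler, n_rounds=25, seed=0):
    """Fit one stump per resample and combine them by majority vote."""
    rng = random.Random(seed)
    stumps = [fit_stump(sampler(data, rng)) for _ in range(n_rounds)]
    def predict(x):
        return Counter(s(x) for s in stumps).most_common(1)[0][0]
    return predict

# Toy data (made up): x from 1 to 5 is class "lo", x from 6 to 10 is class "hi".
data = [(x, "lo" if x <= 5 else "hi") for x in range(1, 11)]
bagged = aggregate(data, bootstrap_sample)
subbagged = aggregate(data, subsample)
print(bagged(1), bagged(10), subbagged(1), subbagged(10))
```

The only difference between the two ensembles is the sampler passed to `aggregate`; everything else (the stump and the vote) is shared.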
July 24 (7:06 PM)
Someone asked me about extending my office hours in order to answer questions about the material to be covered on the exam. Unfortunately, I have some afternoon appointments on Tuesday that hem me in so that I can't start my class office hours prior to 6 PM. But I can be available for questions on Monday, Wednesday, and Thursday. I'll state that I'll start my class office hours at 5:30 PM on Thursday, but I'll just make individual appointments to meet with people at other times. So, if you want to meet with me about the exam, or about your project, contact me by e-mail to set up a time to meet. Good times would be 4-10 PM on Monday, 4-10 PM on Wednesday, and 3-5 PM on Thursday (and then I'll have office hours for whoever wants to stop by from 5:30-7:00 PM on Thursday). If two or three of you want to stop by at the same time, that will be fine --- just let me know when you want to meet.
July 24 (5:42 AM)
I didn't get around to posting an announcement on Thursday that shortly before class I added links on the lecture supplements web page (under the heading for Lecture 12) to notes for three presentations that I did related to TreeNet. Also, I added a link (under the heading for Lecture 11) to some notes I wrote which summarize the coverage of neural network models in HTF. (I didn't present these notes in class, but I added them to my web site for the sake of completeness.) Just this morning, I updated the web page pertaining to the final exam to give a fairly thorough description of the exam. Finally, I have now e-mailed comments to all but one person who presented their projects on Thursday or Friday (and I will try to get to the other person before I go to bed). So, if you presented your project already, look for my e-mailed comments. (So far the project presentations have gone very well --- the timing has been good. The content has also been generally rather good.)
July 21 (9:57 AM)
I added links on the lecture supplements web page (under the heading for Lecture 12) to some R examples showing the use of the adaboost function.
July 19 (5:44 PM)
I added a link on the lecture supplements web page (under the heading for Lecture 11) to some notes about PPR.
July 19 (2:24 AM)
I added a link on the lecture supplements web page (under the heading for Lecture 12) to some material that I will present in today's class (about boosting).
July 18 (8:15 AM)
I added a suggestion for something to do prior to Tuesday's lecture on the home work web page.
July 18 (2:59 AM)
I added a link on the lecture supplements web page (under the heading for Lecture 3, since it goes with the Lecture 3 material) for an R demo featuring stepwise LDA for classification. (Stepwise LDA is a bit similar to stepwise regression --- predictors can be added and removed in an attempt to identify the best set of predictors to use for LDA classification.) Also, I added some new information to the web page pertaining to the projects. (The new information is in bright red.) I also made some adjustments to the schedule of the project presentations, adding some information and changing the time for one person (as requested).
July 17 (8:13 PM)
I added a link on the lecture supplements web page (under the heading for Lecture 11) for an R demo which shows how to use projection pursuit regression (PPR). By going through the demo, it can be seen that a PPR model does better than all of the other models investigated. (It can also be seen that PPR is easy to use!)
July 17 (8:36 AM)
I added a link on the lecture supplements web page (under the heading for Lecture 11) to an updated R demo (pertaining to the example of p. 17 of HTF), which now includes classification by support vector machines (which is often a very good method to try). Later on today, I hope to add or modify some more R demos --- something to show how to use PPR, and perhaps something to show how to use stepwise LDA.
July 16 (8:38 PM)
I posted an improved version of my presentation about Mallows' C_p, which is linked to from the lecture supplements web page (under the heading for Lecture 9). (The page that didn't scan in well has been fixed, and the spelling of Mallows' was corrected.) Also, I added a link to a figure on my separating hyperplanes presentation which is linked to from the lecture supplements web page (under the heading for Lecture 11).
July 14 (9:34 AM)
I posted a link to some material that I will present today (about separating hyperplanes and support vector machines) on the lecture supplements web page (under the heading for Lecture 11). On the web page that is linked to, I need to install a link to a figure (which I will do after I get the figure scanned in).
July 13 (9:30 PM)
I posted a link to my presentation about Mallows' C_p on the lecture supplements web page (under the heading for Lecture 9). One page didn't scan properly, and needs to be fixed, and also I plan to fix where I have Mallow's instead of Mallows' in two places.
July 13 (2:53 AM)
I replaced the notes that didn't scan correctly with better versions. Also I edited the bagging and random forests notes which I previously posted.
July 12 (1:25 PM)
I posted a link to my presentation about MARS on the lecture supplements web page (under the heading for Lecture 7). Three of the pages are currently messed up (the drunken toddler again), but I hope to get them fixed.
July 12 (8:16 AM)
I posted a link to some notes about random forests on the lecture supplements web page (under the heading for Lecture 10). (I intend to add (later) some information about applying a random forest (created by the RandomForests software) to obtain predictions.)
July 11 (6:31 PM)
I posted links to some material related to last Thursday's lecture on the lecture supplements web page (under the heading for Lecture 9). (p. 7 of one of the documents didn't scan properly --- it looks like a drunk 5 year old wrote it. Hopefully, I can get it fixed soon.) Also, I edited the notes on bagging that I posted this morning --- I took out a few things that didn't seem needed, and then added some new remarks at the end.
July 11 (8:30 AM)
I posted a link to some information about bagging on the lecture supplements web page (under the heading for Lecture 10).
July 10 (8:58 PM)
I added some suggestions for things to do prior to Tuesday's lecture on the home work web page. Also, I posted links to some new or modified demos on the lecture supplements web page (under the Lecture 9 heading). Since we only have three more lecture periods prior to the project presentations and the final exam, I advise that you keep up (or catch up) with the suggested reading, and study the posted lecture notes and the demos that I prepare, since I do want to keep moving along at a fast pace and cover most of the items that are on the syllabus. (Note: If I manage to cover the use of RandomForests and TreeNet, then I may not get to cover Weka. This isn't so bad since I've discovered that a lot of the things that I once thought we needed Weka for can be done using R.)
July 9 (6:57 PM)
I added some new information to the web page pertaining to the projects. The new stuff is in red, and included is the schedule for your project presentations. If there is a problem with when I have you scheduled to present, bring it to my attention soon.
July 7 (1:32 PM)
I've now posted links to scanned versions of my notes about Linear Splines, Cubic Splines, and Smoothing Splines on the lecture supplements web page (under the headings for Lecture 6 and Lecture 7). I believe this is everything so far except for the MARS notes, and I won't be able to post them until I can get some more storage space (which I've been trying hard to do for well over a week now).
July 7 (1:42 AM)
I forgot to announce that Wednesday morning I posted a link to a scanned version of my Classification by Density Estimation notes on the lecture supplements web page (under the Lecture 8 heading).
July 5 (6:41 AM)
I posted a link on the lecture supplements web page (under the Lecture 9 heading) to a new R demo that I created pertaining to material that I will present in class on Thursday. On the home work web page I suggest that you go through the demo prior to class. Identify any questions that you may have pertaining to it, and ask them on Thursday.
July 4 (7:34 AM)
I added some information about the scheduling of the oral project presentations on the web page pertaining to the projects. By no later than Thursday afternoon you should e-mail me some information about your project --- e.g., will it pertain to classification or regression, what data set are you using, and on what day would you like to present it to the class (or to my Friday afternoon seminar group). Also, I posted a bit of new stuff on the lecture supplements web page and the home work web page. Of particular interest is that I've added some new parts to the R analysis I did of the situation introduced in the example on p. 17 of HTF. The new material can be found in three places, each of which is clearly marked. In one place, I refer to a problem that I encountered --- I can't get R to do what I want it to do, and I haven't been able to make much sense of the error messages and warnings that I've run into. I'm hoping that one of you will know how to fix the problem. (Also let me know if you are aware of any R functions that will do kernel density estimation in dimensions higher than 2.) Finally, I'll announce that even though we don't have class on Tuesday 7/5, I can make myself available to meet with you about your project if you'd like to schedule an appointment (between 6 PM and 10 PM). I can perhaps find a bit of time to meet with you on Wednesday if you cannot stop by on Tuesday. In any case, e-mail me to make a specific appointment.
June 29 (10:29 PM)
I posted a link to a scanned version of my LDA presentation (from the 3rd lecture) on the lecture supplements web page (under the Lecture 3 heading). I think that I can eventually post scanned versions of the other lecture presentations that I've shown you that haven't been posted previously, but I need to get more space for my web site.
June 27 (8:27 AM)
I posted links on the lecture supplements web page to some R examples that I want you to read over prior to Tuesday's class so that you can ask about parts of them that you don't understand. (I will go over some of the important parts in class, but I won't go through them function by function.) See the home work web page for a related exercise.
June 24 (5:42 AM)
I added a link to a web page version of my Basis Functions presentation to the lecture supplements web page. I hope to get some version of my material on splines posted eventually.
June 24 (12:33 AM)
There is now an improved version of the material for my talk on logistic regression (from the 3rd class meeting) linked to from the lecture supplements web page.
June 23 (12:53 AM)
I made a link to material for my talk on logistic regression (from the 3rd class meeting) on the lecture supplements web page, and I added a link to the CART Walkabout on the home work web page.
June 22 (7:23 AM)
See the home work web page for what you should do prior to tomorrow's class. (I mentioned these things in class Tuesday night.) Also, I added one more data set possibility (but it may be too small) to the web page pertaining to the projects, and I inserted a comment into one of my R examples linked to from the lecture supplements web page (about how one should typically standardize the predictors prior to using k-nearest-neighbors classification).
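The point about standardizing predictors before k-nearest-neighbors can be illustrated with a small sketch. It is in Python rather than R and is not the course's own example; the data, the function names, and the choice of k = 3 are made up for illustration. The first predictor separates the classes but has a small range, while the second is uninformative but has a large range, so on the raw scale it dominates the Euclidean distances.

```python
import math
from collections import Counter
from statistics import mean, stdev

def standardize(rows):
    """Center and scale each column; also return the (mean, sd) pairs used."""
    stats = [(mean(col), stdev(col)) for col in zip(*rows)]
    scaled = [[(x - m) / s for x, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training cases nearest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Made-up training data: column 1 is informative, column 2 is large-scale noise.
train_x = [[1.0, 1000.0], [1.2, 5000.0], [1.1, 3000.0],
           [3.0, 1200.0], [3.2, 5200.0], [3.1, 3100.0]]
train_y = ["a", "a", "a", "b", "b", "b"]
query = [1.1, 5150.0]  # column 1 says "a"

raw_label = knn_predict(train_x, train_y, query)

scaled_x, stats = standardize(train_x)
# The query must be scaled with the training means and standard deviations.
scaled_query = [(x - m) / s for x, (m, s) in zip(query, stats)]
scaled_label = knn_predict(scaled_x, train_y, scaled_query)

print(raw_label, scaled_label)  # prints: b a
```

On the raw scale the vote goes to "b" because the second column swamps the first; after standardizing, the informative column gets comparable weight and the vote goes to "a".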
June 21 (9:22 AM)
I added a suggested problem to the home work web page. Also, I made some minor changes to some of the R examples I link to from the lecture supplements web page, but none of the changes affect the output from the various functions. Finally, I updated the file used for the R demo during the first class, to make the comments much better and to show how to use cross-validation with k-nearest-neighbors classification. I think that I'll cover some of that briefly tonight, but please go through it all outside of class and ask questions about any parts of it that you don't understand. Since part of the material addressed was covered on June 16, I linked to the file under both the June 7 and June 16 headings on the lecture supplements web page.
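For readers who want the general idea of cross-validation with k-nearest-neighbors before opening the R demo, here is a minimal sketch. It is in Python, not the course's R code, and the data, the function names, and the fold assignment by index are all illustrative assumptions. Each fold is held out in turn, the remaining cases serve as the training set, and the misclassification rate over all held-out cases is used to compare values of k.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train is a list of (point, label) pairs; vote among the k nearest points."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cv_error(xs, ys, k, n_folds):
    """Cross-validated misclassification rate for k-nearest-neighbors.

    Case i is assigned to fold i % n_folds, so n_folds == len(xs)
    gives leave-one-out cross-validation.
    """
    n = len(xs)
    errors = 0
    for fold in range(n_folds):
        train = [(xs[i], ys[i]) for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        for i in held_out:
            if knn_predict(train, xs[i], k) != ys[i]:
                errors += 1
    return errors / n

# Made-up data: two well-separated clusters of four points each.
xs = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
ys = ["a", "a", "a", "a", "b", "b", "b", "b"]

# Leave-one-out error for several candidate values of k.
loo = {k: cv_error(xs, ys, k, n_folds=len(xs)) for k in (1, 3, 7)}
print(loo)  # prints: {1: 0.0, 3: 0.0, 7: 1.0}
```

With k = 7 every held-out point is outvoted by the four points of the other cluster, which is why the error jumps to 1.0; small k is clearly preferable for this toy data.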
June 20 (9:58 AM)
I added a link to an R example (dealing with the lasso, ridge regression, and principal components regression) to the lecture supplements web page (putting it under the heading for 6/16, since it goes with that lecture material). I would like for you to go through this on your own prior to Tuesday's class. I will discuss it some in class, but I'll depend on you to ask questions about any parts of it that you don't understand. Also, I moved the link to the CART material, putting it under the heading for 6/21, since I will present the rest of it (most of it) tomorrow. Finally, I created a new web page pertaining to the projects, and have it linked from the home page. I'll add more information about the projects later.
June 19 (12:55 AM)
Right before class on 6/16 I added links to new web pages (related to the 6/16 lecture) on the lecture supplements web page, but I didn't announce them on this page. Over the past few days I have slightly modified those web pages (based in part on good student comments made in class), and I also added a link to a scanned version of my CART presentation. Finally, I modified the R examples for linear classifiers (also linked to from the lecture supplements web page) to show how to do hypothesis tests for logistic regression models, and to include an example for which the predictors have nonnormal distributions.
June 14 (7:02 PM)
I added a new R demo (linked to under the June 14 heading on the lecture supplements web page). Also, I added some more information about downloading the Salford Systems software on this web page. Read this new information if you encounter problems downloading the software.
June 13 (7:24 AM)
I added a new R demo (linked to under the June 9 heading on the lecture supplements web page). This demo covers topics that I talked about last Thursday, but didn't put into the previously discussed regression demo. Please go over it on your own prior to Tuesday's class and ask questions in class about specific parts of it --- I'm not going to go through it line by line since at this point you should be able to grasp most of it without a lot of verbal explanation. In class on Tues I will cover LDA, QDA, and logistic regression --- somewhat simple methods for classification. Then on Thurs, I will tie up loose ends from what I addressed in the previous three classes. I will delay the two CART lectures until next week (which will give you next weekend to do a lot of reading in the CART book).
June 11 (6:33 AM)
I edited both of the web pages that I went over in class on Thursday (linked to under the June 9 heading on the lecture supplements web page). The major change was that I added information about component plus residual plots at the bottom of the OLS Regression web page. Clearly, there is lots more that I could add to all of the previously posted lecture supplements, but I need to resist doing too much of that and focus on preparing new supplements for next week's lectures.
June 10 (5:52 PM)
I have now sent Salford Systems the list of student names. Please go to this web page to obtain information about downloading and installing the software. (Do this within the next few days so that any problems can be addressed before I start covering the use of the software in class.) Please read my instructions carefully so that things will hopefully go smoothly. (Also, don't give the information about downloading the 90 day versions of the software to anyone. (Only people with names matching those I sent Salford Systems can get the special unlock codes.))
June 10 (2:18 AM)
Right before class on Thursday evening, I posted links to the two web pages I went over in class (OLS basics & R code for regression) on the lecture supplements web page.
June 9 (2:35 AM)
I added two exercises to the home work web page, and also added some links to web pages having lots of data sets (some of which may be good for your projects) to the home page for this course. In class today I'll lecture about the basics of ordinary least squares regression, give an R demo about regression, and also continue with the classification example I started last time. Please remind me to say more about grading, and the projects.
June 7 (4:49 PM)
I finished another lecture supplement (R commands for in-class demo) that is linked to from the lecture supplements web page.
June 7 (12:46 PM)
I finished another lecture supplement that is linked to from the lecture supplements web page.
June 7 (6:56 AM)
I created a web page with links to lecture supplements. So far, I've only finished the linked web page on Bayes Classifiers. (The power outage at GMU really messed me up --- my part of campus has been w/o power since about 4:15 PM on Monday. (Tonight's lecture may be a bit rough in places.))
June 6 (5:38 AM)
Please see the welcome page for advice about what you should do to prepare for the first lecture.