CSI 991 Section 001
The seminar will be conducted in the form of a journal club.
The objective is three-fold:
Students who are enrolled in the class will be required to make two 30-to-45-minute presentations, each summarizing a recent article in computational learning or computational statistics. ("Recent" means 2005 or later.)
A brief written report is also required. This can be in the form of straight text (5 or 6 pages) or a copy of the presentation slides.
The presentations can be simple summaries, or, preferably, critical reviews citing other work or possible approaches. Monte Carlo studies or applications on sample datasets would be nice.
Publication of the results of research is generally motivated by personal ambition, rather than by desire to advance science. Personal ambition may include just the satisfaction of becoming better-known within the scientific community and the advancement in one's field of employment. These factors contribute to the plethora of publications. The plethora results in standards for evaluating the merits of different venues of publication.
Different fields of science place differing values on various types of publications. In mathematics and the legacy sciences, "archival" journals have always been at the top. In computer science and various related areas, conference proceedings have risen to the top. Within the class of journals, "impact factors" or various other measures serve to order the journals within a given field. Conferences are often ranked based on "acceptance rate", in which the numerator is the number of papers presented and the denominator is some number determined by the conference organizers, ostensibly related to the number of papers that were submitted for possible presentation.
The form of the bibliographic entry usually depends on the type of publication being referenced, whether it is a book or other type of stand-alone publication, an article in a journal, an article in an edited book, an article in a conference proceedings, a webpage, or some other type of publication.
Webpages and some other types of publication require an additional bit of information: the time when they were accessed (usually just the date of access). This is necessary because the authors of webpages can change the contents without changing the URL.
In a book, the author usually can choose the form to use. In conference proceedings and journals, the organizers or the editors may allow the individual articles to use different forms of bibliographic entries or they may require all articles to use some specified form. Some people feel that use of abbreviations in authors' first names and in the name of the journal raises the level of scholarship.
The character string is divided into two parts, separated by a forward slash. The first part of the string indicates the organization that registered the object, and the second part is the unique string assigned to the object. For example, in the DOI 10.1080/0952813X.2010.505800 the first part indicates the Taylor & Francis Group, which is a large publishing house that publishes a number of journals (including the Journal of the American Statistical Association) and books. The second part indicates a specific article. The format of the second part is decided by the registering organization.
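The two-part structure can be illustrated with a minimal Python sketch. The function name split_doi is hypothetical, and the sketch assumes that the first forward slash is the separator, so any later slashes stay in the second part.

```python
from typing import Tuple


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (registrant part, object part) at the first forward slash."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


# The example DOI from the text above.
prefix, suffix = split_doi("10.1080/0952813X.2010.505800")
print(prefix)  # 10.1080 (the registering organization)
print(suffix)  # 0952813X.2010.505800 (the specific article)
```

Only the position of the slash is checked here; the sketch does not look anything up in a registry.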
A registry that records where each object is stored on the internet is maintained by a consortium organization called the International DOI Foundation.
To find the document in the example DOI above, use
This takes you to a site that is devoted to that specific document. That site is "free", but to access the document, some privilege, such as a subscription or direct payment, may be required. In some cases, your GMU credentials may take you to the document. In other cases, even if GMU has full access rights, you can't get there from here; you have to go to a GMU site and proceed to the location of the document.
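As a rough sketch of the lookup step, a DOI can be turned into a web address by appending it to a resolver address. The sketch below assumes the public resolver at https://doi.org/, which may not be the address originally given on this page; the function name doi_to_url is hypothetical.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver; not necessarily the address the page intended.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolver URL for a DOI, percent-encoding unsafe characters."""
    # Keep "/" unescaped because it separates the two parts of the DOI.
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


print(doi_to_url("10.1080/0952813X.2010.505800"))
# https://doi.org/10.1080/0952813X.2010.505800
```

Opening the resulting address in a browser redirects to the site devoted to the document; whether the document itself is accessible still depends on the privileges described above.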
Here are some examples of the form that I prefer for various types of entries, taken from the oldest sources I could find.
Note the general arrangement of the various fields, the various punctuation marks used, the use of exact names as in the published source, and so on. This is the style I use. You can use whatever you want so long as it conforms to the rules of whoever publishes it and so long as it contains all of the relevant fields.
Another type of citation that is very popular in computer science and various related areas is to assign each bibliography entry a hash code of numerals and letters based on the author(s) name(s), the date, and/or other parts of the bibliographic entry.
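A minimal sketch of one such labeling scheme follows. The rule used here (the first three letters of a single author's last name, or the initials of up to four authors, followed by a two-digit year) is only one possible convention and is an assumption, as is the function name alpha_label.

```python
def alpha_label(last_names, year):
    """Build a short bracketed label from author last names and the year."""
    if len(last_names) == 1:
        stem = last_names[0][:3]  # single author: first three letters
    else:
        stem = "".join(name[0] for name in last_names[:4])  # several authors: initials
    return f"[{stem}{year % 100:02d}]"


# Author names taken from two entries in the presentation schedule below.
print(alpha_label(["Wegman", "Moustafa"], 2006))  # [WM06]
print(alpha_label(["Tompaidis", "Yang"], 2014))   # [TY14]
```

Labels built this way are compact but can collide, so a real bibliography tool would need some rule for telling apart two entries with the same label.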
Another simple type of citation is the author(s) last name(s) and the date of publication, possibly with an appended "a", "b", etc., enclosed in parentheses. This is the method I prefer and use, because when reading the article, if I'm somewhat familiar with the literature, the citation tells me what the reference is without my having to go to the bibliography.
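The author-date scheme, including the appended "a", "b", and so on, can be sketched as follows. The function names, the entry keys, and the "et al." shortening for three or more authors are assumptions made for illustration, and the two "Doe" entries are hypothetical placeholders, not real references.

```python
from collections import defaultdict


def format_authors(last_names):
    """Join last names; the 'et al.' shortening is an assumed convention."""
    if len(last_names) == 1:
        return last_names[0]
    if len(last_names) == 2:
        return f"{last_names[0]} and {last_names[1]}"
    return f"{last_names[0]} et al."


def author_year_labels(entries):
    """Map each entry key to a label such as '(Wegman and Moustafa, 2006)'."""
    groups = defaultdict(list)
    for key, last_names, year in entries:
        groups[(format_authors(last_names), year)].append(key)
    labels = {}
    for (authors, year), keys in groups.items():
        for i, key in enumerate(keys):
            # Append a, b, ... only when authors and year coincide.
            suffix = chr(ord("a") + i) if len(keys) > 1 else ""
            labels[key] = f"({authors}, {year}{suffix})"
    return labels


entries = [
    ("wegman2006", ["Wegman", "Moustafa"], 2006),
    ("doe-first", ["Doe"], 2010),   # hypothetical placeholder entry
    ("doe-second", ["Doe"], 2010),  # hypothetical placeholder entry
]
for key, label in author_year_labels(entries).items():
    print(key, label)
```

The suffix letters follow the order in which the entries are supplied, and the sketch does not handle more than 26 entries with the same authors and year.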
There is a nice program that works with TeX for producing a bibliography, called BibTeX. There are other nice packages that tie in with BibTeX for making citations. I use these tools most of the time.
Assignment 1; due January 30: Select an article for your first presentation.
Email bibliographic info (author, year, title, journal/proceedings name, page numbers) to the instructor. Please use an appropriate bibliographic format.
Presentation by Christine Harvey.
Harvey, Christine; Scott Rosen; James Ramsey; Christopher Saunders; and Samar K. Guharay (2014), Computationally and statistically efficient model fitting techniques, presented at the PyHPC Workshop at the International Conference on High Performance Computing, Networking, Storage and Analysis, New Orleans.
http://www.dlr.de/sc/Portaldata/15/Resources/dokumente/pyhpc2014/submissions/pyhpc2014_submission_6.pdf Accessed February 4, 2015.
Presentation by Francis Opoku.
Quinn C. Payton, Michael G. McManus, Marc H. Weber, Anthony R. Olsen, and Thomas M. Kincaid (2015). micromap: A Package for Linked Micromaps. Journal of Statistical Software, 63(2), 1-16.
Presentation by Rebecca Wagner.
Wegman, Edward J., and Rida E. A. Moustafa (2006), Generalizations of parallel coordinates, in Graphics of Large Datasets: Visualizing a Million, (Antony Unwin, Martin Theus, Heike Hofmann, eds.) 143--156.
Presentation by Seunghye Wilson.
Goldstein, Rhys; Michael Glueck; and Azam Khan (2011), Real-time compression of time series building performance data, Proceedings of Building Simulation 2011: 12th Conference of International Building Performance Simulation Association, Sydney.
Presentation by Andrew Sharp.
Tompaidis, S., & Yang, C. (2014). Pricing American-style options by Monte Carlo simulation: Alternatives to ordinary least squares. The Journal of Computational Finance, 18(1), 121-143.
Retrieved from search.proquest.com/docview/1629026629?accountid=14541
Presentation by Benjamin Hess.
Singh, Rajni Ranjan and Deepak Singh Tomar (2015). Network forensics: Detection and analysis of stealth port scanning attack, International Journal of Computer Networks and Communications Security 3, 33--42.
Presentation by Christine Harvey.
Hong Wan; Ankenman, B.; Nelson, B.L. (2003), Controlled sequential bifurcation: a new factor-screening method for discrete-event simulation, Proceedings of the 2003 Winter Simulation Conference, 565--573.
Presentation by Seunghye Wilson.