CSI 991 Section 001
Contact:
jgentle@gmu.edu
The seminar will be conducted in the form of a journal club, in which recent research articles are summarized.
The seminar may also include discussions, tutorials, and demonstrations of current computational methodology.
The objective of the journal club is three-fold:
The objective of tutorials and demonstrations of computational methodology is to introduce seminar participants to current developments and practices that may not be covered in regular CSI courses. Possible topics include
Attendance at the seminar is open to anyone.
Students may enroll in CSI 991, Section 001, for one hour credit.
The course is graded as "S" or "U".
In order to receive a grade of "S", students who are enrolled in the class
will be required to make two presentations during the semester.
One of the required presentations must be on a recent article in computational learning or computational statistics. ("Recent" means 2005 or later.) The other presentation can be on another article, or it can be on a computational tool or facility; in that case, some actual computing or communication should be demonstrated.
A brief written report is also required. This can be in the form of straight text (5 or 6 pages) or a copy of the presentation slides.
The presentations on research articles can be simple summaries, or, preferably, critical reviews citing other work or possible approaches. Monte Carlo studies or applications on sample datasets would be nice.
Publication of the results of research is generally motivated by personal ambition rather than by a desire to advance science. Personal ambition may be nothing more than the satisfaction of becoming better known within the scientific community, or it may involve advancement in one's field of employment. These factors contribute to the plethora of publications, which in turn results in a wide range in the quality of publications. It is an unfortunate fact that almost anything can be published somewhere.
Different fields of science place differing values on various types of publications. In mathematics and the legacy sciences, "archival" journals have always been at the top. In computer science and various related areas, conference proceedings have risen to the top. Within the class of journals, "impact factors" or various other measures serve to order the journals within a given field. Conferences are often ranked by "acceptance rate", in which the numerator is the number of papers presented and the denominator is some number determined by the conference organizers, ostensibly related to the number of papers that were submitted for possible presentation.
The form of the bibliographic entry usually depends on the type of publication being referenced, whether it is a book or other type of stand-alone publication, an article in a journal, an article in an edited book, an article in a conference proceedings, a webpage, or some other type of publication.
Webpages and some other types of publication require an additional bit of information: the time when it was accessed (usually just the date of access). This is necessary because the authors of webpages can change the contents without changing the URL.
In a book, the author usually can choose the form to use. In conference proceedings and journals, the organizers or the editors may allow the individual articles to use different forms of bibliographic entries or they may require all articles to use some specified form. Some people feel that use of abbreviations in authors' first names and in the name of the journal raises the level of scholarship.
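As a concrete illustration, these distinctions among publication types show up directly in BibTeX, where each type of publication has its own entry kind and set of fields. The entries below are invented placeholders, not real references; note the extra access-date field for the webpage:

```bibtex
@book{doe2010,
  author    = {Doe, Jane},
  title     = {An Invented Book Title},
  publisher = {Some Publisher},
  year      = {2010}
}

@article{doe2011,
  author  = {Doe, Jane and Roe, Richard},
  title   = {An invented journal article},
  journal = {Some Journal},
  volume  = {12},
  pages   = {1--20},
  year    = {2011}
}

@inproceedings{roe2012,
  author    = {Roe, Richard},
  title     = {An invented conference paper},
  booktitle = {Proceedings of Some Conference},
  pages     = {100--110},
  year      = {2012}
}

@misc{doe2013web,
  author       = {Doe, Jane},
  title        = {An invented webpage},
  howpublished = {\url{https://example.com}},
  note         = {Accessed 2013-09-01}
}
```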
The character string is divided into two parts, separated by a forward slash. The first part of the string indicates the organization that registered the object, and the second part is the unique string assigned to the object. For example, in the DOI 10.1080/0952813X.2010.505800, the first part indicates the Taylor & Francis Group, which is a large publishing house that publishes a number of journals (including the Journal of the American Statistical Association) and books. The second part indicates a specific article. The format of the second part is decided by the registering organization.
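Splitting a DOI into its two parts is just a matter of breaking the string at the first forward slash. A minimal sketch (the function name `split_doi` is invented for illustration):

```python
def split_doi(doi):
    """Split a DOI into its registrant prefix and object suffix.

    The prefix (before the first slash) identifies the registering
    organization; the suffix is the unique string that organization
    assigned to the object.
    """
    prefix, suffix = doi.split("/", 1)  # split at the FIRST slash only
    return prefix, suffix

# The example DOI from the text; prefix 10.1080 is the Taylor & Francis Group.
prefix, suffix = split_doi("10.1080/0952813X.2010.505800")
print(prefix)   # 10.1080
print(suffix)   # 0952813X.2010.505800
```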
A registry, maintained by a consortium organization called the International DOI
Foundation, records where the object is stored on the internet.
To find the document in the example DOI above, use
https://dx.doi.org/10.1080/0952813X.2010.505800
This takes you to a site that is devoted to that specific document. That site
is "free", but to
access the document, some privilege, such as a subscription or direct
payment, may be required. In some cases, your GMU credentials may take you to
the document.
In other cases, even if GMU has full access rights, you can't get there from
here; you have to go to a GMU site and proceed to the location of the document.
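The resolver URL for any DOI is formed the same way as in the example above: prepend the resolver's address to the DOI string. A one-line sketch (the function name `doi_url` is invented):

```python
def doi_url(doi):
    """Return the resolver URL for a DOI, as in the example in the text."""
    return "https://dx.doi.org/" + doi

print(doi_url("10.1080/0952813X.2010.505800"))
# https://dx.doi.org/10.1080/0952813X.2010.505800
```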
Here are some examples of the form that I prefer for various types of entries, drawn from the oldest sources I could find.
Note the general arrangement of the various fields, the various punctuation marks used, the use of exact names as in the published source, and so on. This is the style I use. You can use whatever you want so long as it conforms to the rules of whoever publishes it and so long as it contains all of the relevant fields.
Another type of citation that is very popular in computer science and various related areas is to assign each bibliography entry a hash code of numerals and letters based on the author(s) name(s), the date, and/or other parts of the bibliographic entry.
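One plausible scheme for such keys, loosely modeled on the "alpha" labels BibTeX can produce, is sketched below; the exact rules here are an illustration, not a standard, and the function name `citation_key` is invented:

```python
def citation_key(last_names, year):
    """Build a short alphanumeric citation key from author last names
    and a publication year, roughly in the style of "alpha" labels.

    A single author contributes the first three letters of the name;
    multiple authors each contribute their initial letter.  A
    two-digit year is appended.
    """
    if len(last_names) == 1:
        letters = last_names[0][:3]
    else:
        letters = "".join(name[0] for name in last_names)
    return letters + str(year % 100).zfill(2)

# Using author names that appear in the schedule below:
print(citation_key(["Chaovalit", "Gangopadhyay", "Karabatis", "Chen"], 2011))
# CGKC11
print(citation_key(["Mone"], 2013))
# Mon13
```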
Another simple type of citation is the author(s) last name(s) and the date of publication, possibly with an appended "a", "b", etc., enclosed in parentheses. This is the method I prefer and use, because when reading the article, if I'm somewhat familiar with the literature, the citation tells me what the reference is without my having to go to the bibliography.
There is a nice program for producing bibliographies in TeX called BibTeX. There are other nice packages that tie in with BibTeX for making citations. I use these packages most of the time.
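A minimal sketch of the workflow (filenames invented): \cite commands in the document pull entries from a .bib file, and BibTeX formats the reference list. The plain style shown gives numeric citations; packages such as natbib provide the author-year form described above.

```latex
% main.tex -- typical run: latex main, bibtex main, latex main, latex main
\documentclass{article}
\begin{document}
Deep learning is surveyed by \cite{lecun2015}.
\bibliographystyle{plain}   % choose a style matching your preferred format
\bibliography{refs}         % entries live in refs.bib
\end{document}
```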
For your first presentation: if it is on a recent research article, email the bibliographic information (author, year, title, journal/proceedings name, page numbers) to the instructor, using an appropriate bibliographic format.
If it is to be on a computational tool or facility, briefly describe what you propose to discuss and indicate the format of your presentation (demo, code walk-through, etc.).
4:05pm
Presentation by William Ampeh.
Chaovalit, Pimwadee; Aryya Gangopadhyay; George Karabatis;
and Zhiyuan Chen (2011),
Discrete wavelet transform-based time series analysis and mining,
ACM Computing Surveys, article no. 6.
4:45pm
Presentation by John Leung.
Bobadilla, J.; F. Ortega; A. Hernando; and A. Gutiérrez (2013),
Recommender systems survey,
Knowledge-Based Systems, 46, 109--132.
4:10pm
Presentation by William Basinger.
Erdin, Rebekka; Christoph Frei; and Hans Kuensch (2012),
Data transformation and uncertainty in geostatistical combination of
radar and rain gauges,
Journal of Hydrometeorology, 1332--1346.
4:40pm
Presentation by Redouane Betrouni.
Wu, Wenyan; Robert J. May; Holger R. Maier; and Graeme C. Dandy (2013),
On a benchmarking approach for comparing data splitting methods for modeling
water resources parameters using artificial neural networks,
Water Resources Research, 49, 7598--7614.
The presentations this week, and some in subsequent weeks, are on deep learning,
which, aside from just its catchy name, does offer some additional analytical
power. The common types of deep learning methods are based on neural nets (or
"artificial" neural nets, ANNs). Their main characteristic is the use of
multiple hidden layers, along with modifications, such as "convolutions", to prevent overfitting.
William Ampeh has provided the following links for information about R packages
that implement neural nets. (Thanks!)
4:10pm
Presentation by Kevin Ham.
Roy, D.; K.S.R. Murty; and C.K. Mohan (2015),
Feature selection using deep neural networks,
2015 International Joint Conference on Neural Networks (IJCNN),
1--6.
4:40pm
Presentation by Yijun Wei.
LeCun, Yann; Yoshua Bengio; and Geoffrey Hinton (2015),
Deep learning,
Nature, 521, 436--444.
4:10pm
Presentation by Ibrahim Elhag.
Mone, G. (2013), Beyond Hadoop, Communications of the ACM ???
This finishes the first round of presentations.
4:40pm
Presentation by William Ampeh.
Tools for scientific computation.
Slides.
4:10pm
Presentation by John Leung.
Deep Learning with CUDA on GPU through the Python Theano Library.
4:40pm
Presentation by William Basinger.
von Tscharner, M.; S.M. Schmalholz; and J.-L. Epard
(2016),
3-D Numerical Models of Viscous Flow Applied to Fold Nappes and the Rawil
Depression in the Helvetic Nappe System (Western Switzerland),
Journal of Structural Geology, 32--46.
4:10pm
Presentation by Redouane Betrouni.
4:25pm
Presentation by Kevin Ham.
Jia, Yangqing; Evan Shelhamer; Jeff Donahue; Sergey Karayev;
Jonathan Long; Ross Girshick; Sergio Guadarrama; and Trevor Darrell
(2014),
Caffe: Convolutional Architecture for Fast Feature Embedding,
Proceedings of the 22nd ACM International Conference on Multimedia,
675--678,
doi: 10.1145/2647868.2654889
Abstract.
4:50pm
Presentation by Yijun Wei.
Mnih, Volodymyr; Koray Kavukcuoglu; David Silver; Andrei A. Rusu; Joel Veness;
Marc G. Bellemare; Alex Graves; Martin Riedmiller; Andreas K. Fidjeland;
Georg Ostrovski; Stig Petersen; Charles Beattie; Amir Sadik; Ioannis Antonoglou;
Helen King; Dharshan Kumaran; Daan Wierstra; Shane Legg; and Demis Hassabis (2015),
Human-level control through deep reinforcement learning,
Nature, 518, 529--533.