CSI 991

Seminar in Computational Statistics:

Recent articles in computational learning and computational statistics

Spring, 2015

Fridays 3:00pm -- 5:30pm
Innovation Hall 338

CSI 991 Section 001

Contact:
jgentle@gmu.edu


The seminar will be conducted in the form of a journal club.

The objective is three-fold:

Students who are enrolled in the class will be required to make two 30-to-45-minute presentations, each summarizing a recent article in computational learning or computational statistics. ("Recent" means 2005 or later.)

A brief written report is also required. This can be in the form of straight text (5 or 6 pages) or a copy of the presentation slides.

The presentations can be simple summaries, or, preferably, critical reviews citing other work or possible approaches. Monte Carlo studies or applications on sample datasets would be nice.

Make sure that work that is supposed to be yours is indeed your own.

With cut-and-paste capabilities on webpages, it is easy to plagiarize.
Sometimes it is even accidental, because it results from legitimate note-taking; nevertheless, it is plagiarism and it is illegal.
Whenever you include a picture, graphic, or text from another source, give a clear citation of the previous work.

Scientific Literature

The advancement of science depends on the dissemination of the results of research so that others can build on that research. Research results are disseminated in various ways including personal communication, webpages, formal education in a classroom or individualized instruction, seminars, presentations at organized conferences and/or published proceedings of presentations at those conferences, books, journals, and magazines.

Publication of the results of research is generally motivated by personal ambition, rather than by a desire to advance science. Personal ambition may include just the satisfaction of becoming better known within the scientific community and advancement in one's field of employment. These factors contribute to the plethora of publications. The plethora, in turn, results in standards for evaluating the merits of different venues of publication.

Different fields of science place differing values on various types of publications. In mathematics and the legacy sciences, "archival" journals have always been at the top. In computer science and various related areas, conference proceedings have risen to the top. Within the class of journals, "impact factors" or various other measures serve to order the journals within a given field. Conferences are often ranked based on "acceptance rate", in which the numerator is the number of papers presented and the denominator is some number determined by the conference organizers, ostensibly related to the number of papers that were submitted for possible presentation.

References to the Scientific Literature

The standard form of a bibliographic entry varies from one publication venue to another. At the very least a bibliographic entry must include the name(s) of the author(s), the date of publication, the title of the work, and whatever else is necessary for someone to locate the work.

The form of the bibliographic entry usually depends on the type of publication being referenced, whether it is a book or other type of stand-alone publication, an article in a journal, an article in an edited book, an article in a conference proceedings, a webpage, or some other type of publication.

Webpages and some other types of publication require an additional bit of information: the time when it was accessed (usually just the date of access). This is necessary because the authors of webpages can change the contents without changing the URL.

In a book, the author usually can choose the form to use. In conference proceedings and journals, the organizers or the editors may allow the individual articles to use different forms of bibliographic entries or they may require all articles to use some specified form. Some people feel that use of abbreviations in authors' first names and in the name of the journal raises the level of scholarship.

Use of DOI

A digital object identifier (DOI) is a character string that uniquely identifies an electronic file. The file may be a document, an image, a video, or some other type of object. For our purposes in this seminar, it is most likely an article or a book.

The character string is divided into two parts, separated by a forward slash. The first part of the string indicates the organization that registered the file, and the second part is the unique string assigned to the object. For example, in the DOI 10.1080/0952813X.2010.505800 the first part indicates the Taylor & Francis Group, which is a large publishing house that publishes a number of journals (including the Journal of the American Statistical Association) and books. The second part indicates a specific article. The format of the second part is decided by the registering organization.
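The two-part structure can be illustrated with a few lines of Python. This is only a sketch: the function name split_doi is made up for this page, and it assumes that the first forward slash is the separator (the second part may itself contain slashes).

```python
def split_doi(doi):
    """Split a DOI into (registrant part, object part) at the first forward slash."""
    prefix, sep, suffix = doi.partition("/")
    # A usable DOI needs a slash with something on both sides of it.
    if not sep or not prefix or not suffix:
        raise ValueError(f"not of the form prefix/suffix: {doi!r}")
    return prefix, suffix


prefix, suffix = split_doi("10.1080/0952813X.2010.505800")
print(prefix)  # 10.1080
print(suffix)  # 0952813X.2010.505800
```

The sketch does not check whether the first part corresponds to a real registrant; that would require a lookup in the registry.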

The International DOI Foundation, a consortium organization, maintains a registry that records where each object is stored on the internet. To find the document in the example DOI above, use
https://dx.doi.org/10.1080/0952813X.2010.505800
This takes you to a site that is devoted to that specific document. That site is "free", but to access the document, some privilege, such as a subscription or direct payment, may be required. In some cases, your GMU credentials may take you to the document. In other cases, even if GMU has full access rights, you can't get there from here; you have to go to a GMU site and proceed to the location of the document.
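If you have a list of DOIs, the corresponding addresses can be built mechanically. The following Python sketch assumes the resolver address shown above; the names DOI_RESOLVER and doi_to_url are made up for this page. It only builds the address and does not check that the DOI is registered or that you have access to the document.

```python
from urllib.parse import quote

# Resolver address taken from the example above.
DOI_RESOLVER = "https://dx.doi.org/"


def doi_to_url(doi):
    """Return the resolver address for a DOI string."""
    # Percent-encode characters that are unsafe in a URL path,
    # but keep "/" because it separates the two parts of the DOI.
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


print(doi_to_url("10.1080/0952813X.2010.505800"))
# https://dx.doi.org/10.1080/0952813X.2010.505800
```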

Examples of References

You can find examples of different forms of bibliographic entries in the references sections of books and journal articles. For your assignments, please use a "standard" bibliographic format. (There are lots of "standards".)

Here are some examples of the form that I prefer for various types of entries from the oldest sources I could find.

Note the general arrangement of the various fields, the various punctuation marks used, the use of exact names as in the published source, and so on. This is the style I use. You can use whatever you want so long as it conforms to the rules of whoever publishes it and so long as it contains all of the relevant fields.

Citations to the Scientific Literature

Within the text of a document, we make citations to entries in the bibliography. In different areas of science and/or in different journals, there are different standard ways of citing the entries in the bibliography. A simple way is just to number the entries sequentially and use the number, usually enclosed in square brackets, to reference them in the text. The entries in the bibliography may be ordered alphabetically by the last name of the first author or in the order in which they are cited in the document.

Another type of citation that is very popular in computer science and various related areas is to assign each bibliography entry a hash code of numerals and letters based on the author(s) name(s), the date, and/or other parts of the bibliographic entry.

Another simple type of citation is the author(s) last name(s) and the date of publication, possibly with an appended "a", "b", etc., enclosed in parentheses. This is the method I prefer and use, because when reading the article, if I'm somewhat familiar with the literature, the citation tells me what the reference is without my having to go to the bibliography.
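As an illustration of the author-date form, here is a small Python sketch that builds the labels and appends "a", "b", etc. when the same author(s) and year occur more than once. The names and years are placeholders, not real references, and the rule used for several authors ("and" for two, "et al." for three or more) is an assumption; conventions vary from one venue to another.

```python
from collections import Counter
from string import ascii_lowercase


def author_text(last_names):
    """Format the author part of the label (assumed convention)."""
    if len(last_names) == 1:
        return last_names[0]
    if len(last_names) == 2:
        return f"{last_names[0]} and {last_names[1]}"
    return f"{last_names[0]} et al."


def author_year_labels(entries):
    """entries is a list of (list of last names, year) pairs, in bibliography order."""
    keys = [(author_text(names), year) for names, year in entries]
    totals = Counter(keys)  # how often each author/year pair occurs
    seen = Counter()        # how many of each pair have been labeled so far
    labels = []
    for key in keys:
        authors, year = key
        suffix = ""
        if totals[key] > 1:
            # Only supports up to 26 entries per author/year pair.
            suffix = ascii_lowercase[seen[key]]
            seen[key] += 1
        labels.append(f"{authors} ({year}{suffix})")
    return labels


# Placeholder entries, not real references.
refs = [
    (["Alpha"], 2010),
    (["Alpha"], 2010),
    (["Beta", "Gamma"], 2012),
    (["Delta", "Epsilon", "Zeta"], 2014),
]
for label in author_year_labels(refs):
    print(label)
```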

References and Citations in TeX

I strongly encourage the use of TeX in scientific writing.

There is a nice program for producing a bibliography in TeX called BibTeX. There are other nice packages that tie in with BibTeX for making citations. I use these packages most of the time.
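For those who have not seen a BibTeX database, the following Python sketch prints one entry in the general layout that BibTeX reads from a .bib file. The function name bibtex_entry is made up for this page, and the key and all field values are placeholders, not a real reference.

```python
def bibtex_entry(entry_type, key, fields):
    """Format one BibTeX entry as text; fields maps field names to values."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        # Braces around each value keep BibTeX from changing capitalization inside it.
        lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder values, not a real reference.
print(bibtex_entry("article", "author2010", {
    "author": "Author, A. N. and Writer, B.",
    "title": "A Placeholder Title",
    "journal": "Journal of Placeholder Studies",
    "year": "2010",
}))
```

In the TeX document itself, the entry would be cited with \cite{author2010}, and the reference list would be produced by the \bibliographystyle and \bibliography commands.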

Accessing the Scientific Literature at GMU

Find online articles at http://library.gmu.edu/
Click on "E-Journals" and then enter a keyword such as "machine learning", "statistical learning", or "computational statistics", or else enter the name of a specific journal.

Schedule