java.lang.Object
  index.Thesaurus
Reads the files in a given directory and builds a collection from them, constructing both an inverted index and an index of the documents.
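As an illustration of the two indexes named in the description above, the following is a minimal sketch of how an inverted index (gram to documents) can be derived from an index of the documents (document to fingerprints). It is not the source of this class: `InvertedIndexSketch` and `invert` are hypothetical names, and plain `java.util` maps stand in for the `MultiMap` and `FileID` types, whose definitions are not shown on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in; NOT the actual Thesaurus implementation.
class InvertedIndexSketch {

    // Turns "document -> fingerprints" into "gram -> documents containing it".
    static Map<Integer, List<String>> invert(Map<String, List<Integer>> fingerprintsByDocument) {
        Map<Integer, List<String>> inverted = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : fingerprintsByDocument.entrySet()) {
            String document = entry.getKey();
            for (Integer gram : entry.getValue()) {
                List<String> documents = inverted.computeIfAbsent(gram, g -> new ArrayList<>());
                // Record each document at most once per gram, even if the gram repeats.
                if (!documents.contains(document)) {
                    documents.add(document);
                }
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        // Assumed sample data: two documents with made-up fingerprint values.
        Map<String, List<Integer>> fingerprints = new LinkedHashMap<>();
        fingerprints.put("a.txt", Arrays.asList(11, 42, 42));
        fingerprints.put("b.txt", Arrays.asList(42, 77));
        System.out.println(invert(fingerprints));
    }
}
```

In this sketch a document is listed once per gram regardless of how often the gram occurs in it; whether the real `MultiMap` keeps duplicates is not stated here.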
Field Summary | |
private java.util.Map |
allFilesFingerprints
Index of the documents |
private FileID[] |
fileIDs
List of fileIDs. |
private java.io.File[] |
files
The files of the collection, kept for obtaining information about them |
private MultiMap |
invertedIndex
Inverted index of the collection |
Constructor Summary | |
Thesaurus(java.lang.String path)
Make the thesaurus from the files in the specified directory |
Method Summary | |
private java.util.ArrayList |
extractAllGrams(java.io.File[] files)
Each file has a corresponding entry in the returned list, holding that file's list of fingerprints |
private FileID[] |
fillFileIDs()
Make another view of allFilesFingerprints in the form of FileID[] |
java.util.Map |
getAllFilesFingerprints()
|
java.util.List |
getDocuments(java.lang.Integer gram)
The list of documents that contain the specified gram |
java.util.List |
getDocumentTerms(java.lang.String document)
Gets the terms of the document with the specified name |
FileID[] |
getFileIDs()
|
private java.util.Map |
getFingerprints()
Make the index from the documents |
MultiMap |
getInvertedIndex()
|
int |
getNumberOfDocuments()
Gets the number of documents in the collection. |
int |
getNumberOfDocumentsContainingGram(java.lang.Integer gram)
Gets the number of documents containing the specified gram (fingerprint) |
private MultiMap |
makeInvertedIndex()
Make the inverted index |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
private MultiMap invertedIndex
private java.util.Map allFilesFingerprints
private java.io.File[] files
private FileID[] fileIDs
Constructor Detail |
public Thesaurus(java.lang.String path) throws DirectoryNotFound, FileTooShort
path
- the directory containing the files of the collection
DirectoryNotFound
- if the directory does not exist
FileTooShort
Method Detail |
public java.util.List getDocumentTerms(java.lang.String document)
document
- the name of the document
public int getNumberOfDocuments()
public int getNumberOfDocumentsContainingGram(java.lang.Integer gram)
gram
- the gram for which the number of containing documents is computed
public java.util.List getDocuments(java.lang.Integer gram)
gram
- the specified gram
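The two gram lookups above, `getNumberOfDocumentsContainingGram` and `getDocuments`, can be pictured as simple reads of an inverted index. The sketch below assumes the index is a `Map<Integer, List<String>>` and that an unknown gram yields an empty list; both are assumptions, since this page does not show the `MultiMap` type or the behaviour for a missing gram. `GramLookupSketch` is a hypothetical class, not part of the documented API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lookup side of Thesaurus.
class GramLookupSketch {
    private final Map<Integer, List<String>> invertedIndex;

    GramLookupSketch(Map<Integer, List<String>> invertedIndex) {
        this.invertedIndex = invertedIndex;
    }

    // Documents containing the gram; empty when the gram is unknown (assumed behaviour).
    List<String> getDocuments(Integer gram) {
        return invertedIndex.getOrDefault(gram, Collections.emptyList());
    }

    // Document frequency of the gram: the size of its entry in the inverted index.
    int getNumberOfDocumentsContainingGram(Integer gram) {
        return getDocuments(gram).size();
    }

    public static void main(String[] args) {
        // Assumed sample data with made-up fingerprint values.
        Map<Integer, List<String>> index = new HashMap<>();
        index.put(42, Arrays.asList("a.txt", "b.txt"));
        index.put(77, Arrays.asList("b.txt"));

        GramLookupSketch lookup = new GramLookupSketch(index);
        System.out.println(lookup.getDocuments(42));
        System.out.println(lookup.getNumberOfDocumentsContainingGram(77));
        System.out.println(lookup.getNumberOfDocumentsContainingGram(5));
    }
}
```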
private FileID[] fillFileIDs()
private MultiMap makeInvertedIndex()
private java.util.Map getFingerprints() throws FileTooShort
FileTooShort
private java.util.ArrayList extractAllGrams(java.io.File[] files) throws FileTooShort
files
- the files to obtain fingerprints
FileTooShort
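How `extractAllGrams` computes fingerprints is not described on this page. One common approach, shown here purely as an assumption, is to slide a window of `k` characters over the text and hash each window to an `Integer`. In the sketch below, `GramExtractionSketch`, `extractGrams`, and the parameter `k` are hypothetical, `String.hashCode()` is a stand-in hash, and `IllegalArgumentException` takes the place of the project's `FileTooShort`, whose definition is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of gram (fingerprint) extraction; not the actual method.
class GramExtractionSketch {

    // Hashes every window of k consecutive characters in the text.
    static List<Integer> extractGrams(String text, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        if (text.length() < k) {
            // Stand-in for FileTooShort: the text cannot supply even one gram.
            throw new IllegalArgumentException("text is shorter than one gram of length " + k);
        }
        List<Integer> grams = new ArrayList<>();
        for (int start = 0; start + k <= text.length(); start++) {
            grams.add(text.substring(start, start + k).hashCode());
        }
        return grams;
    }

    public static void main(String[] args) {
        // "abcd" with k = 3 yields one hash for "abc" and one for "bcd".
        System.out.println(extractGrams("abcd", 3).size());
    }
}
```

A text of length n produces n - k + 1 grams in this sketch. A real fingerprinting scheme may keep only a selected subset of these hashes.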
public java.util.Map getAllFilesFingerprints()
public FileID[] getFileIDs()
public MultiMap getInvertedIndex()