Package org.carrot2.text.vsm
Class VectorSpaceModelContext
java.lang.Object
org.carrot2.text.vsm.VectorSpaceModelContext
Stores data related to the Vector Space Model of the processed documents.
-
Field Summary
Fields (modifier and type, field, description):
final PreprocessingContext preprocessingContext: Preprocessing context for the underlying documents.
com.carrotsearch.hppc.IntIntHashMap stemToRowIndex: Stem index to row index mapping for the termDocumentMatrix.
org.carrot2.math.mahout.matrix.DoubleMatrix2D termDocumentMatrix: Term-document matrix.
org.carrot2.math.mahout.matrix.DoubleMatrix2D termPhraseMatrix: Term-document-like matrix for phrases from PreprocessingContext.AllLabels.
-
Constructor Summary
Constructors (constructor, description):
VectorSpaceModelContext(PreprocessingContext preprocessingContext): Creates a vector space model context with the provided preprocessing context.
-
Method Summary
-
Field Details
-
preprocessingContext
Preprocessing context for the underlying documents.
-
termDocumentMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termDocumentMatrix
Term-document matrix. Rows of the matrix correspond to word stems; columns correspond to the processed documents. For the mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
-
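As a rough illustration of the row and column layout described for termDocumentMatrix, the sketch below builds a small matrix with plain Java arrays. It does not use Carrot2 or Mahout classes: the class name, the assignRows and build helpers, and the raw term-count weighting are all assumptions made for this example, and TermDocumentMatrixBuilder may select and weight terms differently.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not part of Carrot2: rows are stems, columns are documents.
class TermDocumentMatrixSketch {
    /** Assigns a row index to each distinct stem, in order of first appearance. */
    static Map<String, Integer> assignRows(List<List<String>> docs) {
        Map<String, Integer> rows = new LinkedHashMap<>();
        for (List<String> doc : docs) {
            for (String stem : doc) {
                rows.putIfAbsent(stem, rows.size());
            }
        }
        return rows;
    }

    /** Fills matrix[row][column] with the raw count of the stem in the document (assumed weighting). */
    static double[][] build(List<List<String>> docs, Map<String, Integer> rows) {
        double[][] matrix = new double[rows.size()][docs.size()];
        for (int d = 0; d < docs.size(); d++) {
            for (String stem : docs.get(d)) {
                matrix[rows.get(stem)][d] += 1.0;
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Each document is given as a list of already-stemmed words.
        List<List<String>> docs = List.of(
            List.of("data", "mine", "data"),
            List.of("mine", "text"));
        Map<String, Integer> rows = assignRows(docs);
        double[][] matrix = build(docs, rows);
        for (Map.Entry<String, Integer> e : rows.entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(matrix[e.getValue()]));
        }
    }
}
```

Each printed line is one matrix row: a stem followed by its per-document counts.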
termPhraseMatrix
public org.carrot2.math.mahout.matrix.DoubleMatrix2D termPhraseMatrix
Term-document-like matrix for phrases from PreprocessingContext.AllLabels. If there are no phrases in PreprocessingContext.AllLabels, the phrase matrix is null. For the mapping between rows of this matrix and PreprocessingContext.AllStems, see stemToRowIndex.
This matrix is produced by TermDocumentMatrixBuilder.buildTermPhraseMatrix(VectorSpaceModelContext).
-
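Because termPhraseMatrix is null when there are no phrases, callers need a null check before reading it. The sketch below shows one way to handle that with plain arrays. It assumes that columns correspond to phrases and that entries are raw stem counts; neither is confirmed by this page, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not part of Carrot2: rows are stems, columns are assumed to be phrases.
class TermPhraseMatrixSketch {
    /** Builds a stems-by-phrases matrix, or returns null when there are no phrases. */
    static double[][] build(List<List<String>> phrases, Map<String, Integer> stemToRow, int rowCount) {
        if (phrases.isEmpty()) {
            return null; // mirrors the documented "no phrases" case
        }
        double[][] matrix = new double[rowCount][phrases.size()];
        for (int p = 0; p < phrases.size(); p++) {
            for (String stem : phrases.get(p)) {
                Integer row = stemToRow.get(stem);
                if (row != null) { // stems without a matrix row are skipped
                    matrix[row][p] += 1.0;
                }
            }
        }
        return matrix;
    }

    /** Returns the number of phrase columns, treating a null matrix as zero phrases. */
    static int phraseCount(double[][] matrix) {
        return (matrix == null || matrix.length == 0) ? 0 : matrix[0].length;
    }

    public static void main(String[] args) {
        Map<String, Integer> stemToRow = Map.of("data", 0, "mine", 1);
        double[][] matrix = build(
            List.of(List.of("data", "mine"), List.of("text", "mine")), stemToRow, 2);
        System.out.println(phraseCount(matrix));
        System.out.println(phraseCount(build(List.of(), stemToRow, 2)));
    }
}
```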
stemToRowIndex
public com.carrotsearch.hppc.IntIntHashMap stemToRowIndex
Stem index to row index mapping for the termDocumentMatrix. Keys in this map are indices of entries in PreprocessingContext.AllStems arrays; values are the indices of the termDocumentMatrix rows corresponding to the stems. Note that, depending on the limit on the size of the matrix, some stems may not have corresponding matrix rows.
This object is produced by TermDocumentMatrixBuilder.buildTermDocumentMatrix(VectorSpaceModelContext).
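Since some stems may have no matrix row, a lookup through stemToRowIndex should distinguish "no row" from row 0. The sketch below illustrates the idea with java.util.HashMap standing in for the HPPC IntIntHashMap. The class name, the helpers, and the maxRows parameter are hypothetical, and the way the real builder chooses which stems receive rows is not described on this page.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical stand-in, not part of Carrot2: HashMap plays the role of stemToRowIndex.
class StemToRowIndexSketch {
    /** Maps the first maxRows stem indices to consecutive rows; remaining stems get no row. */
    static Map<Integer, Integer> assignRows(int[] stemIndicesByPriority, int maxRows) {
        Map<Integer, Integer> stemToRow = new HashMap<>();
        for (int i = 0; i < stemIndicesByPriority.length && i < maxRows; i++) {
            stemToRow.put(stemIndicesByPriority[i], i);
        }
        return stemToRow;
    }

    /** Returns the matrix row for a stem index, or empty if the stem was left out of the matrix. */
    static OptionalInt rowFor(Map<Integer, Integer> stemToRow, int stemIndex) {
        Integer row = stemToRow.get(stemIndex);
        return row == null ? OptionalInt.empty() : OptionalInt.of(row);
    }

    public static void main(String[] args) {
        // Stem indices 7, 2 and 5, with room for only two matrix rows.
        Map<Integer, Integer> stemToRow = assignRows(new int[] {7, 2, 5}, 2);
        System.out.println(rowFor(stemToRow, 7));
        System.out.println(rowFor(stemToRow, 5));
    }
}
```

With a primitive-keyed map such as IntIntHashMap, the same care applies: check for the key's presence (for example with containsKey) before trusting the returned value.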
-
-
Constructor Details
-
VectorSpaceModelContext
Creates a vector space model context with the provided preprocessing context.
-