Class ClusterBuilder

java.lang.Object
org.carrot2.attrs.AttrComposite
org.carrot2.clustering.lingo.ClusterBuilder
All Implemented Interfaces:
AcceptingVisitor

public class ClusterBuilder extends AttrComposite
Builds cluster labels based on the reduced term-document matrix and assigns documents to the labels.
  • Field Details

    • phraseLabelBoost

      public AttrDouble phraseLabelBoost
      Weight of multi-word labels relative to one-word labels. Low values produce more one-word labels; higher values favor multi-word labels.
    • phraseLengthPenaltyStart

      public AttrInteger phraseLengthPenaltyStart
      Phrase length at which overlong multi-word labels start to be penalized. Phrases shorter than phraseLengthPenaltyStart are not penalized.
    • phraseLengthPenaltyStop

      public AttrInteger phraseLengthPenaltyStop
      Phrase length beyond which overlong multi-word labels are removed completely. Phrases longer than phraseLengthPenaltyStop are removed.
    • clusterMergingThreshold

      public AttrDouble clusterMergingThreshold
      Percentage of overlap between two clusters' document sets at which to merge the clusters. Low values result in more aggressive merging, which may lead to irrelevant documents in clusters. High values result in fewer clusters being merged, which may lead to very similar or duplicated clusters.
    • labelAssigner

      public LabelAssigner labelAssigner
      The method of assigning documents to labels when forming clusters.
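The interaction of phraseLabelBoost, phraseLengthPenaltyStart and phraseLengthPenaltyStop described above can be illustrated with a small standalone sketch. It does not use Carrot2 classes and is not the formula Lingo applies internally: the linear penalty ramp, the class LabelScoreSketch and the methods lengthPenalty and score are hypothetical, chosen only to show the documented behavior (no penalty below the start length, removal above the stop length, a multiplicative boost for multi-word labels).

```java
final class LabelScoreSketch {
    // Hypothetical penalty factor. Assumes penaltyStart <= penaltyStop.
    // Returns 1.0 for phrases shorter than penaltyStart (not penalized),
    // 0.0 for phrases longer than penaltyStop (removed), and a linearly
    // decreasing factor for lengths in between. The linear shape is an
    // assumption made for illustration only.
    static double lengthPenalty(int phraseLength, int penaltyStart, int penaltyStop) {
        if (phraseLength < penaltyStart) {
            return 1.0;
        }
        if (phraseLength > penaltyStop) {
            return 0.0;
        }
        int stepsIntoRamp = phraseLength - penaltyStart + 1;
        int rampLength = penaltyStop - penaltyStart + 2;
        return 1.0 - (double) stepsIntoRamp / rampLength;
    }

    // Hypothetical label score: the boost applies to multi-word labels only,
    // then the length penalty scales the result.
    static double score(double baseScore, int phraseLength, double phraseLabelBoost,
                        int penaltyStart, int penaltyStop) {
        double boost = phraseLength > 1 ? phraseLabelBoost : 1.0;
        return baseScore * boost * lengthPenalty(phraseLength, penaltyStart, penaltyStop);
    }

    public static void main(String[] args) {
        // Example values are arbitrary, not library defaults.
        double boost = 1.5;
        int start = 3;
        int stop = 5;
        for (int length = 1; length <= 6; length++) {
            System.out.println(length + " word(s): " + score(1.0, length, boost, start, stop));
        }
    }
}
```

In this sketch a score of zero stands in for removal of the label. Raising the boost shifts the ranking toward multi-word labels, while moving the start and stop lengths closer together makes the cutoff for long phrases sharper.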
  • Constructor Details

    • ClusterBuilder

      public ClusterBuilder()
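The effect of clusterMergingThreshold can be sketched in the same standalone way. The overlap measure below (shared documents divided by the size of the smaller cluster) is an assumption, as are the class ClusterMergeSketch and the methods overlap and shouldMerge; the documentation above only states that clusters are merged once the overlap of their document sets reaches the threshold. The sketch also assumes the threshold is expressed as a fraction between 0 and 1.

```java
import java.util.HashSet;
import java.util.Set;

final class ClusterMergeSketch {
    // Assumed overlap measure: number of shared document ids divided by
    // the size of the smaller document set. Returns 0.0 if either set is empty.
    static double overlap(Set<Integer> first, Set<Integer> second) {
        if (first.isEmpty() || second.isEmpty()) {
            return 0.0;
        }
        Set<Integer> shared = new HashSet<>(first);
        shared.retainAll(second);
        return (double) shared.size() / Math.min(first.size(), second.size());
    }

    // Two clusters are merged when their overlap reaches the threshold.
    static boolean shouldMerge(Set<Integer> first, Set<Integer> second,
                               double clusterMergingThreshold) {
        return overlap(first, second) >= clusterMergingThreshold;
    }

    public static void main(String[] args) {
        // Document ids are arbitrary example data.
        Set<Integer> first = Set.of(1, 2, 3, 4);
        Set<Integer> second = Set.of(3, 4, 5);
        System.out.println("overlap: " + overlap(first, second));
        System.out.println("merge at 0.5: " + shouldMerge(first, second, 0.5));
        System.out.println("merge at 0.7: " + shouldMerge(first, second, 0.7));
    }
}
```

With the example sets, two of the three documents in the smaller cluster are shared, so a low threshold merges the pair and a higher one keeps the clusters separate. This mirrors the trade-off described for clusterMergingThreshold: low values merge aggressively, high values leave near-duplicate clusters in place.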