Class SimilarityComputer


  • public final class SimilarityComputer
    extends Object
    Computes the similarity between two text contents and decides whether they are close enough to be treated as a rename.
    • Field Detail

      • MINIMUM_LENGTH

        public static final int MINIMUM_LENGTH
        The minimum length both sides must have to be compared. Shorter contents are ignored because they can appear similar by chance and cause false positives.
        See Also:
        Constant Field Values
      • THRESHOLD

        public static final double THRESHOLD
        The maximum share of differing lines two contents may have and still be considered a rename.
        See Also:
        Constant Field Values
    • Method Detail

      • computeDifference

        public static double computeDifference(InputStream a,
                                               InputStream b)
                                        throws IOException
        Computes the difference between two InputStream instances. The returned value is the ratio of changed lines to total lines, where the total is the larger of the two streams' line counts. Returns Double.MAX_VALUE if either stream is null or if the content is too short to be compared (shorter than MINIMUM_LENGTH).
        Parameters:
        a - the first input stream
        b - the second input stream
        Returns:
        the ratio of changed lines to total lines, or Double.MAX_VALUE if the streams cannot be compared
        Throws:
        IOException - if reading one of the input streams fails
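
        Example (illustrative sketch only):
        The ratio described for computeDifference can be pictured with the stand-alone sketch below. It is not the implementation of this class. It assumes that "changed lines" means the lines outside a longest common subsequence of the two inputs, that MINIMUM_LENGTH is measured in lines, and it uses made-up values for both constants. The names DifferenceSketch, ASSUMED_MINIMUM_LENGTH, ASSUMED_THRESHOLD, ratio and isRenameCandidate are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class DifferenceSketch {
    // Hypothetical values; the real constants are listed under "Constant Field Values".
    static final int ASSUMED_MINIMUM_LENGTH = 3;   // assumed to be counted in lines
    static final double ASSUMED_THRESHOLD = 0.5;   // assumed to be a ratio in [0, 1]

    // Reads all lines of a stream as UTF-8 text.
    static List<String> readLines(InputStream in) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.toList());
        }
    }

    // Length of the longest common subsequence of lines (classic dynamic programming).
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                dp[i][j] = a.get(i - 1).equals(b.get(j - 1))
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return dp[a.size()][b.size()];
    }

    // Ratio of changed lines to the larger line count; Double.MAX_VALUE when
    // either side is missing or too short (assumption: length counted in lines).
    static double ratio(List<String> a, List<String> b) {
        if (a == null || b == null
                || a.size() < ASSUMED_MINIMUM_LENGTH || b.size() < ASSUMED_MINIMUM_LENGTH) {
            return Double.MAX_VALUE;
        }
        int total = Math.max(a.size(), b.size());
        int changed = total - lcsLength(a, b);
        return (double) changed / total;
    }

    // Assumed rename decision: the difference must not exceed the threshold.
    static boolean isRenameCandidate(double difference) {
        return difference <= ASSUMED_THRESHOLD;
    }

    public static void main(String[] args) throws IOException {
        InputStream a = new ByteArrayInputStream("a\nb\nc\nd".getBytes(StandardCharsets.UTF_8));
        InputStream b = new ByteArrayInputStream("a\nb\nx\nd".getBytes(StandardCharsets.UTF_8));
        double difference = ratio(readLines(a), readLines(b));
        System.out.println(difference);                    // one changed line out of four
        System.out.println(isRenameCandidate(difference));
    }
}
```

        In this sketch, two four-line inputs that differ in one line give a ratio of 0.25, and inputs that are too short give Double.MAX_VALUE, which can never pass a threshold check. The real class may count changed lines differently.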