
Java - calculate Levenshtein distance between strings

Created by:
Maggotta

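In this short article, we would like to show a simple Java implementation of the Levenshtein distance algorithm.

The Levenshtein distance measures the difference between two sequences (e.g. between two strings). It is the minimum number of single-character insertions, deletions, and substitutions needed to turn one sequence into the other.

When the algorithm returns 0, the compared strings are equal. The comparison is case-sensitive.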
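To make the definition concrete, the distance can be written as a direct recursion over the last characters of both strings. The sketch below is for illustration only: the class name `NaiveLevenshtein` is hypothetical, and the running time grows exponentially with the string lengths, so it is only suitable for very short inputs.

```java
// Hypothetical class name; a naive recursive definition, not meant for real workloads.
class NaiveLevenshtein {

    // Distance between the first aLen chars of a and the first bLen chars of b.
    static int distance(String a, int aLen, String b, int bLen) {
        if (aLen == 0) return bLen; // insert every remaining char of b
        if (bLen == 0) return aLen; // delete every remaining char of a
        int cost = a.charAt(aLen - 1) == b.charAt(bLen - 1) ? 0 : 1;
        int deletion = distance(a, aLen - 1, b, bLen) + 1;
        int insertion = distance(a, aLen, b, bLen - 1) + 1;
        int substitution = distance(a, aLen - 1, b, bLen - 1) + cost;
        return Math.min(Math.min(deletion, insertion), substitution);
    }

    static int distance(String a, String b) {
        return distance(a, a.length(), b, b.length());
    }

    public static void main(String[] args) {
        System.out.println(distance("Google", "Gogle")); // 1
        System.out.println(distance("Ann", "Matt"));     // 4
    }
}
```

The table-based implementation computes the same values, but stores each intermediate result once instead of recomputing it.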

Simple implementation:

public class Program {

    private static int findMin(int a, int b, int c) {
        int min = Math.min(a, b);
        return Math.min(min, c);
    }

    private static int calculateLevenshteinDistance(String a, String b) {
        int aLimit = a.length() + 1;
        int bLimit = b.length() + 1;
        // distance[i][j] is the distance between the first i chars of a and the first j chars of b
        int[][] distance = new int[aLimit][bLimit];
        for (int i = 0; i < aLimit; ++i) {
            distance[i][0] = i; // i deletions turn a prefix of a into an empty string
        }
        for (int j = 0; j < bLimit; ++j) {
            distance[0][j] = j; // j insertions turn an empty string into a prefix of b
        }
        for (int i = 1; i < aLimit; ++i) {
            for (int j = 1; j < bLimit; ++j) {
                char aChar = a.charAt(i - 1);
                char bChar = b.charAt(j - 1);
                distance[i][j] = findMin(
                    distance[i - 1][j] + 1, // deletion
                    distance[i][j - 1] + 1, // insertion
                    distance[i - 1][j - 1] + (aChar == bChar ? 0 : 1) // + substitution cost
                );
            }
        }
        return distance[a.length()][b.length()];
    }


    // Usage example:

    public static void main(String[] args) {

        System.out.println(calculateLevenshteinDistance("Chris",  "Chris"));  // 0
        System.out.println(calculateLevenshteinDistance("John1",  "John2"));  // 1
        System.out.println(calculateLevenshteinDistance("Google", "Gogle"));  // 1
        System.out.println(calculateLevenshteinDistance("Ann",    "Matt" ));  // 4

        System.out.println(calculateLevenshteinDistance("CHRIS",  "Chris"));  // 4
    }
}

Output:

0
1
1
4
4
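Each cell of the table depends only on the current and the previous row, so the full matrix is not strictly required. As a sketch under that observation (the class name `TwoRowLevenshtein` is hypothetical and not part of the original article), the same distance can be computed while keeping only two rows in memory:

```java
// Hypothetical class name; a memory-saving variant of the table-based algorithm.
class TwoRowLevenshtein {

    static int distance(String a, String b) {
        int[] previous = new int[b.length() + 1]; // row i - 1 of the table
        int[] current = new int[b.length() + 1];  // row i of the table
        for (int j = 0; j <= b.length(); ++j) {
            previous[j] = j; // empty prefix of a versus first j chars of b
        }
        for (int i = 1; i <= a.length(); ++i) {
            current[0] = i; // first i chars of a versus empty prefix of b
            char aChar = a.charAt(i - 1);
            for (int j = 1; j <= b.length(); ++j) {
                int cost = aChar == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                    Math.min(previous[j] + 1, current[j - 1] + 1), // deletion, insertion
                    previous[j - 1] + cost                         // substitution
                );
            }
            // The finished row becomes the previous row for the next iteration.
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("Chris", "Chris"));  // 0
        System.out.println(distance("Google", "Gogle")); // 1
        System.out.println(distance("Ann", "Matt"));     // 4
    }
}
```

This variant uses memory proportional to the length of `b` instead of the product of both lengths. The running time is unchanged.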

 

Case-insensitive Levenshtein distance

To ignore letter case, wrap the existing algorithm in a method that converts both strings with toLowerCase() or toUpperCase() before comparing them.

public class Program {

    private static int findMin(int a, int b, int c) {
        int min = Math.min(a, b);
        return Math.min(min, c);
    }

    private static int calculateLevenshteinDistance(String a, String b) {
        int aLimit = a.length() + 1;
        int bLimit = b.length() + 1;
        // distance[i][j] is the distance between the first i chars of a and the first j chars of b
        int[][] distance = new int[aLimit][bLimit];
        for (int i = 0; i < aLimit; ++i) {
            distance[i][0] = i; // i deletions turn a prefix of a into an empty string
        }
        for (int j = 0; j < bLimit; ++j) {
            distance[0][j] = j; // j insertions turn an empty string into a prefix of b
        }
        for (int i = 1; i < aLimit; ++i) {
            for (int j = 1; j < bLimit; ++j) {
                char aChar = a.charAt(i - 1);
                char bChar = b.charAt(j - 1);
                distance[i][j] = findMin(
                    distance[i - 1][j] + 1, // deletion
                    distance[i][j - 1] + 1, // insertion
                    distance[i - 1][j - 1] + (aChar == bChar ? 0 : 1) // + substitution cost
                );
            }
        }
        return distance[a.length()][b.length()];
    }

    // Normalizes letter case first, so that e.g. "CHRIS" and "Chris" are treated as equal.
    private static int calculateImprovedLevenshteinDistance(String a, String b) {
        return calculateLevenshteinDistance(a.toLowerCase(), b.toLowerCase());
    }


    // Usage example:

    public static void main(String[] args) {

        System.out.println(calculateImprovedLevenshteinDistance("CHRIS",  "Chris"));  // 0
        System.out.println(calculateImprovedLevenshteinDistance("JOHN1",  "John2"));  // 1
        System.out.println(calculateImprovedLevenshteinDistance("GOOGLE", "Gogle"));  // 1
        System.out.println(calculateImprovedLevenshteinDistance("ANN",    "Matt" ));  // 3
    }
}

Output:

0
1
1
3
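Note that toLowerCase() without arguments uses the default locale of the JVM, so the result can differ between machines (for example, under a Turkish locale the letter "I" is lowercased to a dotless "ı"). A minimal sketch of a locale-independent wrapper, assuming that `Locale.ROOT` rules are acceptable for the compared text, could look as follows. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical class name; case-insensitive distance that ignores the JVM default locale.
class CaseInsensitiveLevenshtein {

    // Same table-based algorithm as in the article, condensed to keep the sketch self-contained.
    static int distance(String a, String b) {
        int[][] distance = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); ++i) {
            distance[i][0] = i;
        }
        for (int j = 0; j <= b.length(); ++j) {
            distance[0][j] = j;
        }
        for (int i = 1; i <= a.length(); ++i) {
            for (int j = 1; j <= b.length(); ++j) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                distance[i][j] = Math.min(
                    Math.min(distance[i - 1][j] + 1, distance[i][j - 1] + 1),
                    distance[i - 1][j - 1] + cost
                );
            }
        }
        return distance[a.length()][b.length()];
    }

    // Locale.ROOT makes the case conversion independent of the machine's settings.
    static int distanceIgnoreCase(String a, String b) {
        return distance(a.toLowerCase(Locale.ROOT), b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(distanceIgnoreCase("CHRIS", "Chris")); // 0
        System.out.println(distanceIgnoreCase("ANN", "Matt"));    // 3
    }
}
```

Whether locale-specific case rules are wanted depends on the data being compared, so treat the choice of `Locale.ROOT` as an assumption rather than a requirement.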

See also

  1. Java - Math.min() with multiple arguments

References

  1. Levenshtein distance - Wikipedia