Codebase list libtext-levenshtein-damerau-perl / debian/latest
debian/latest

Tree @debian/latest (Download .tar.gz)

NAME
    Text::Levenshtein::Damerau - Damerau Levenshtein edit distance.

SYNOPSIS
            use Text::Levenshtein::Damerau;

            my @targets = ('fuor','xr','fourrrr','fo');

            # Initialize Text::Levenshtein::Damerau object with text to compare against
            my $tld = Text::Levenshtein::Damerau->new('four');

            print $tld->dld($targets[0]);
            # prints 1

            my $tld = $tld->dld({ list => \@targets });
            print $tld->{'fuor'};
            # prints 1

            print $tld->dld_best_match({ list => \@targets });
            # prints fuor

            print $tld->dld_best_distance({ list => \@targets });
            # prints 1


            # or even more simply
            use Text::Levenshtein::Damerau qw/edistance/;
        
            print edistance('Neil','Niel');
            # prints 1

DESCRIPTION
    Returns the true Damerau Levenshtein edit distance of strings with
    adjacent transpositions. Useful for fuzzy matching, DNA variation
    metrics, and fraud detection.

    Defaults to using Pure Perl Text::Levenshtein::Damerau::PP, but has an
    XS addon Text::Levenshtein::Damerau::XS for massive speed imrovements.
    Works correctly with utf8 if backend supports it; known to work with
    "Text::Levenshtein::Damerau::PP" and "Text::Levenshtein::Damerau::XS".

            use Text::Levenshtein::Damerau;
            use utf8;

            my $tld = Text::Levenshtein::Damerau->new('ⓕⓞⓤⓡ');
            print $tld->dld('ⓕⓤⓞⓡ');
            # prints 1

CONSTRUCTOR
  new
    Creates and returns a "Text::Levenshtein::Damerau" object. Takes a
    scalar with the text (source) you want to compare against.

            my $tld = Text::Levenshtein::Damerau->new('Neil');
            # Creates a new Text::Levenshtein::Damerau object $tld

METHODS
  $tld->dld
    Scalar Argument: Takes a string to compare with.

    Returns: an integer representing the edit distance between the source
    and the passed argument.

    Hashref Argument: Takes a hashref containing:

    *   list => \@array (array ref of strings to compare with)

    *   *OPTIONAL* max_distance => $int (only return results with $int
        distance or less).

    *   *OPTIONAL* backend => 'Some::Module::its_function' Any module that
        will take 2 arguments and returns an int. If the module fails to
        load, the function doesn't exist, or the function doesn't return a
        number when passed 2 strings, then "backend" remains unchanged.

                # Override defaults and use Text::Levenshtein::Damerau::PP's pp_edistance()
                $tld->dld({ list=> \@list, backend => 'Text::Levenshtein::Damerau::PP::pp_edistance');

                # Override defaults and use Text::Levenshtein::Damerau::XS's xs_edistance()
                use Text::Levenshtein::Damerau;
                requires Text::Levenshtein::Damerau::XS;
                ...
                $tld->dld({ list=> \@list, backend => 'Text::Levenshtein::Damerau::XS::xs_edistance');

    Returns: hashref with each word from the passed list as keys, and their
    edit distance (if less than max_distance, which is unlimited by
    default).

            my $tld = Text::Levenshtein::Damerau->new('Neil');
            print $tld->dld( 'Niel' );
            # prints 1

            #or if you want to check the distance of various items in a list

            my @names_list = ('Niel','Jack');
            my $tld = Text::Levenshtein::Damerau->new('Neil');
            my $d_ref = $tld->dld({ list=> \@names_list }); # pass a list, returns a hash ref
            print $d_ref->{'Niel'}; #prints 1
            print $d_ref->{'Jack'}; #prints 4

  $tld->dld_best_match
    Argument: an array reference of strings.

    Returns: the string with the smallest edit distance between the source
    and the array of strings passed.

    Takes distance of $tld source against every item in @targets, then
    returns the string of the best match.

            my $tld = Text::Levenshtein::Damerau->new('Neil');
            my @name_spellings = ('Niel','Neell','KNiel');
            print $tld->dld_best_match({ list=> \@name_spellings });
            # prints Niel

  $tld->dld_best_distance
    Arguments: an array reference of strings.

    Returns: the smallest edit distance between the source and the array
    reference of strings passed.

    Takes distance of $tld source against every item in the passed array,
    then returns the smallest edit distance.

            my $tld = Text::Levenshtein::Damerau->new('Neil');
            my @name_spellings = ('Niel','Neell','KNiel');
            print $tld->dld_best_distance({ list => \@name_spellings });
            # prints 1

EXPORTABLE METHODS
  edistance
    Arguments: source string and target string.

    *   *OPTIONAL 3rd argument* int (max distance; only return results with
        $int distance or less). 0 = unlimited. Default = 0.

    Returns: int that represents the edit distance between the two argument.
    -1 if max distance is set and reached.

    Wrapper function to take the edit distance between a source and target
    string. It will attempt to use, in order:

    *   Text::Levenshtein::Damerau::XS xs_edistance

    *   Text::Levenshtein::Damerau::PP pp_edistance

            use Text::Levenshtein::Damerau qw/edistance/;
            print edistance('Neil','Niel');
            # prints 1

SEE ALSO
    *   <https://github.com/ugexe/Text--Levenshtein--Damerau> *repository*

    *   <http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance>
        *damerau levenshtein explaination*

    *   Text::Fuzzy *regular levenshtein distance*

BUGS
    Please report bugs to:

    <https://rt.cpan.org/Public/Dist/Display.html?Name=Text-Levenshtein-Dame
    rau>

AUTHOR
    Nick Logan <ug@skunkds.com>

LICENSE AND COPYRIGHT
    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.