NAME
    Email::Find - Find RFC 822 email addresses in plain text

SYNOPSIS
      use Email::Find;

      # new object-oriented interface
      my $finder = Email::Find->new(\&callback);
      my $num_found = $finder->find(\$text);

      # good old functional style
      $num_found = find_emails($text, \&callback);

DESCRIPTION
    Email::Find is a module for finding a *subset* of RFC 822 email
    addresses in arbitrary text (see the section on "CAVEATS"). The
    addresses it finds are not guaranteed to exist or even actually be email
    addresses at all (see the section on "CAVEATS"), but they will be valid
    RFC 822 syntax.

    Email::Find will perform some heuristics to avoid some of the more
    obvious red herrings and false addresses, but there's only so much which
    can be done without a human.

METHODS
    new
          $finder = Email::Find->new(\&callback);

        Constructs a new Email::Find object. The specified callback will be
        called with each email address as it is found.

    find
          $num_emails_found = $finder->find(\$text);

        Finds email addresses in the text and executes the registered
        callback for each one. Returns the number of addresses found.

        The callback is given two arguments. The first is a Mail::Address
        object representing the address found. The second is the actual
        original email as found in the text. Whatever the callback returns
        will replace the original text.
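
        As a minimal sketch of that replacement behaviour, the callback
        below rewrites each address into a less harvestable form. The
        sample text and address are placeholders, and the sketch assumes
        the user() and host() accessors of Mail::Address.

```perl
use strict;
use warnings;
use Email::Find;

# Hypothetical sample text; the address is a placeholder.
my $text = 'Contact alice@example.com for details.';

my $finder = Email::Find->new(sub {
    my($email, $orig_email) = @_;
    # $email is a Mail::Address object; $orig_email is the matched string.
    # Whatever is returned here is substituted into the text.
    return $email->user . ' at ' . $email->host;
});

my $count = $finder->find(\$text);
print "$count address(es) rewritten: $text\n";
```

        Returning $orig_email instead would leave the text unchanged.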

FUNCTIONS
        For backward compatibility, Email::Find exports one function,
        find_emails(). It works much like URI::Find's find_uris().
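
        A short sketch of the functional style, following the call shape
        shown in the SYNOPSIS. The sample text and addresses are
        placeholders; the callback collects each address and returns the
        original match so the text is left as it was.

```perl
use strict;
use warnings;
use Email::Find;

# Hypothetical sample text with placeholder addresses.
my $text = 'Ask bob@example.org or carol@example.net.';
my @found;

# Same callback contract as the object interface:
# (Mail::Address object, original matched string).
my $num_found = find_emails($text, sub {
    my($email, $orig_email) = @_;
    push @found, $email->address;
    return $orig_email;
});

print "$num_found found: @found\n";
```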

EXAMPLES
          use Email::Find;

          # Simply print out all the addresses found leaving the text undisturbed.
          my $finder = Email::Find->new(sub {
                                            my($email, $orig_email) = @_;
                                            print "Found ".$email->format."\n";
                                            return $orig_email;
                                        });
          $finder->find(\$text);

          # For each email found, ping its host to see if it's alive.
          require Net::Ping;
          my $ping = Net::Ping->new;
          my %Pinged = ();
          my $finder = Email::Find->new(sub {
                                            my($email, $orig_email) = @_;
                                            my $host = $email->host;
                                            # Skip hosts already pinged; always return the
                                            # original text so the input is left undisturbed.
                                            return $orig_email if exists $Pinged{$host};
                                            $Pinged{$host} = $ping->ping($host);
                                            return $orig_email;
                                        });

          $finder->find(\$text);

          while( my($host, $up) = each %Pinged ) {
              print "$host is ", ($up ? 'up' : 'down'), "\n";
          }

          # Count how many addresses are found.
          my $finder = Email::Find->new(sub { $_[1] });
          print "Found ", $finder->find(\$text), " addresses\n";

          # Wrap each address in an HTML mailto link.
          my $finder = Email::Find->new(
              sub {
                  my($email, $orig_email) = @_;
                  my($address) = $email->format;
                  return qq|<a href="mailto:$address">$orig_email</a>|;
              },
          );
          $finder->find(\$text);

SUBCLASSING
        If you want to change the way this module finds email addresses,
        you can write a subclass of Email::Find that overrides the
        "addr_regex" and "do_validate" methods.

        For example, the following class additionally finds email
        addresses with a dot just before the at sign. Such addresses are
        illegal under RFC 822; see the Email::Valid::Loose manpage for
        details.

          package Email::Find::Loose;
          use base qw(Email::Find);
          use Email::Valid::Loose;

          # should return regex, which Email::Find will use in finding
          # strings which are "thought to be" email addresses
          sub addr_regex {
              return $Email::Valid::Loose::Addr_spec_re;
          }

          # should check whether $addr is a valid email address.
          # if so, return the address as a string.
          # otherwise, return undef
          sub do_validate {
              my($self, $addr) = @_;
              return Email::Valid::Loose->address($addr);
          }

        Here is another example, which uses the Mail::CheckUser module to
        check whether the address actually exists.

          package Email::Find::Existent;
          use base qw(Email::Find);
          use Mail::CheckUser qw(check_email);

          sub do_validate {
              my($self, $addr) = @_;
              return check_email($addr) ? $addr : undef;
          }
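
        A subclass is used exactly like Email::Find itself. The sketch
        below is one more illustration under stated assumptions: the class
        name Email::Find::ExampleOrgOnly and the domain example.org are
        hypothetical, and it assumes the inherited do_validate can be
        reached through SUPER. It keeps only addresses at one domain.

```perl
use strict;
use warnings;

# Hypothetical subclass name, used only for this sketch.
package Email::Find::ExampleOrgOnly;
use base qw(Email::Find);

sub do_validate {
    my($self, $addr) = @_;
    # Let the parent class do its normal validation first.
    my $valid = $self->SUPER::do_validate($addr);
    return undef unless defined $valid;
    # Then accept only addresses at the placeholder domain example.org.
    return $valid =~ /\@example\.org\z/i ? $valid : undef;
}

package main;

my $text = 'Write to bob@example.org or carol@example.com.';
my @found;
my $finder = Email::Find::ExampleOrgOnly->new(sub {
    my($email, $orig_email) = @_;
    push @found, $email->address;
    return $orig_email;
});
my $count = $finder->find(\$text);
print "$count accepted: @found\n";
```

        Addresses rejected by do_validate are left in the text untouched
        and are not counted.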

CAVEATS
        Why a subset of RFC 822?
            I say that this module finds a *subset* of RFC 822 because if I
            attempted to look for *all* possible valid RFC 822 addresses I'd
            wind up practically matching the entire block of text! The
            complete specification is so wide open that it's difficult to
            construct something that's *not* an RFC 822 address.

            To keep myself sane, I look for the 'address spec' or 'global
            address' part of an RFC 822 address. This is the part which most
            people consider to be an email address (the 'foo@bar.com' part)
            and it is also the part which contains the information necessary
            for delivery.
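
            To illustrate, the sketch below feeds a full header-style
            address with a display name and a comment to the finder and
            records what was actually matched. The sample header is a
            placeholder; only the address spec is expected to reach the
            callback.

```perl
use strict;
use warnings;
use Email::Find;

# Hypothetical header line: display name, address spec, trailing comment.
my $text = 'From: Jane Doe <jane@example.com> (work)';
my @matches;

my $finder = Email::Find->new(sub {
    my($email, $orig_email) = @_;
    # Record the exact string that was matched in the text.
    push @matches, $orig_email;
    return $orig_email;
});

$finder->find(\$text);
print "matched: $_\n" for @matches;
```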

        Why are some of the matches not email addresses?
            Alas, many things which aren't email addresses *look* like email
            addresses and parse just fine as them. The biggest headache is
            email and Usenet message IDs. I do my best to avoid
            them, but there's only so much cleverness you can pack into one
            library.

AUTHORS
        Copyright 2000, 2001 Michael G Schwern <schwern@pobox.com>. All
        rights reserved.

        Current maintainer is Tatsuhiko Miyagawa <miyagawa@bulknews.net>.

THANKS
        Schwern thanks Jeremy Howard for his patch to make it work under
        5.005.

LICENSE
        This module is free software; you may redistribute it and/or modify
        it under the same terms as Perl itself.

        The author STRONGLY SUGGESTS that this module not be used for the
        purposes of sending unsolicited email (i.e. spamming) in any way,
        shape or form or for the purposes of generating lists for commercial
        sale.

        If you use this module for spamming I reserve the right to make fun
        of you.

SEE ALSO
        the Email::Valid manpage, RFC 822, the URI::Find manpage, the
        Apache::AntiSpam manpage, the Email::Valid::Loose manpage