Anagram finder in Perl – Lars Nordenström's blog

This is the Perl implementation of the anagram program. The problem and the solution is described in the main anagram article. Do read that article first.

Also, you may find the C++ article helpful, since it describes member data and functions in some detail, not repeated here. This article describes only the differences between the C++ implementation and this implementation.

The Perl and C++ implementations are almost identical; the differences are mostly syntactical.

There is no separate Word class. All words are instantiated in the AnagramFinder constructor.

Note the block within the map operation in the AnagramFinder constructor. The last value in the block is the value of the block. The assignment my $value does not interfer with the generated list.

The source code

#!/usr/bin/perl
use strict;

my   $MINIMUM = 3;
my   $comparisons;

#---------------------------------------------------------------------------------------- AnagramFinder
package AnagramFinder;

sub new {
    my ($class, $text, $len) = @_;
    my $last = length($$text)-$len;

    my @words = map {
        my $value = substr($$text, $_, $len);
        {
            offset  => $_,
            value   => $value,
            unified => join('', sort(split(//, $value)))
        }
    } (0..$last);

    return bless {
        text     => $text,
        words    => \@words,
        len      => $len,
        accepted => undef,
        low_offset => 0,
    };
}

sub print_accepted($) {
    my ($this) = @_;
    my $accepted = $this->{'accepted'};
    printf("%3d:", scalar @$accepted);
    map { printf(" %4d:%s", $_->{'offset'}, $_->{'value'}) } @$accepted;
    print("\n");
}

sub add_candidate($$) {
    my ($this, $cand) = @_;
    my $cand_offset = $cand->{'offset'};
    my $cand_value  = $cand->{'value'};
    my $accepted    = $this->{'accepted'};
    return if ($cand_offset < $this->{'low_offset'});

    for my $it (@$accepted) {
        return if ($cand_value eq $it->{'value'});
    }
    push @{$this->{'accepted'}}, $cand;
    $this->{'low_offset'} = $cand_offset + $this->{'len'};
}

sub add_range($@) {
    my $this = shift;
    $this->{'accepted'} = [];
    $this->{'low_offset'} = 0;
    my $accepted = $this->{'accepted'};
    if (@_ >= $MINIMUM) {
        for my $q (@_) {
            $this->add_candidate($q);
        }
        $this->print_accepted() if (@$accepted >= $MINIMUM);
    }
}

sub find_all($) {
    my ($this) = @_;
    @{$this->{'words'}}   = sort { $comparisons++;
        $a->{'unified'} cmp $b->{'unified'} || $a->{'offset'} <=> $b->{'offset'}
    } (@{$this->{'words'}});

    my @r = ();
    my $prev;
    foreach my $p (@{$this->{'words'}}) {
        my $unified = $p->{'unified'};
        if ($prev && $unified ne $prev) {
            $this->add_range(@r);
            undef @r;
        }
        $prev = $unified;
        push @r, $p;
    }
    $this->add_range(@r);
    return;
}

#---------------------------------------------------------------------------------------- Main program

sub main() {
    my $from =  @ARGV ? (shift @ARGV) : 6;
    my $to   =  @ARGV ? (shift @ARGV) : $from;

    my $text = uc(join("", <>)); #--- Read whole file, convert to upper case
    $text    =~ s/[^A-Z]+//g;    #--- Remove everything except letters

    foreach my $l ($from..$to) {
        printf("%slen = %d\n", (($l > $from) ? "\n" : ""), $l);
        new AnagramFinder(\$text, $l)->find_all();
    }
    printf("string length = %5d, comparisons = %d\n", length($text), $comparisons);
}

main();

You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090

View source for the content of this page.

The source code

Cancel reply