Anagram Finder in Ruby

This is the Ruby implementation of the anagram program. The problem and the solution is described in the main anagram article. Do read that article first.

Also, you may find the C++ article helpful, since it describes member data and functions in some detail, not repeated here. This article describes only the differences between the C++ implementation and this implementation.

The Ruby implementation is similar to the other scripting implementations; the differences are mostly syntactical.

Comparing words is implemented is the comparison (<=>) operator in Word. That operator is used implicitly by the array sort method.

The Ruby program is currently the shortest, 58 significant lines to Perl’s 70. It is also the slowest, using 50 per cent more time than Perl, and 150-200 per cent more than Python and PHP.

The source code

#!/usr/bin/env ruby
MINIMUM = 3
$comparisons = 0

#---------------------------------------------------------------------------------------- Word

class Word
  def initialize(offset, value, unified)
    @offset  = offset.to_i
    @value   = value
    @unified = unified
  end

  def <=>(b)
    $comparisons = $comparisons+1
    tmp = unified <=> b.unified
    return tmp != 0 ? tmp : offset <=> b.offset
  end

  attr_reader :offset, :value, :unified
end

#---------------------------------------------------------------------------------------- AnagramFinder

class AnagramFinder
  def initialize(text, len)
    last      = text.length-len
    @text     = text
    @len      = len
    @words    = []
    @accepted = []
    @low_offset = 0
    @words    = (0..last).to_a.collect {
      |i| value = @text[i, @len]
      Word.new(i, value, get_unified(value))
    }
  end

  def get_unified(value)
    return value.split(//).sort!.join("")
  end

  def print_accepted()
    print("%3d:" % [ @accepted.length ])
    @accepted.each { |it| print(" %4d:%s" % [0+it.offset, it.value]) }
    print("\n")
  end

  def add_candidate(cand)
    return if cand.offset < @low_offset                       #--- Overlapping
    @accepted.each { |it| return if cand.value == it.value }  #--- Duplicates
    @accepted.push(cand)
    @low_offset = cand.offset + @len
  end

  def add_range(first, last)
    if (last - first + 1 >= MINIMUM)
      @accepted = []
      @low_offset = 0
      (first..last).each { |n| add_candidate(@words[n]) }
      print_accepted if (@accepted.length >= MINIMUM)
    end
  end

  def find_all()
    @words.sort!
    first = 0
    for n in (1..(@words.length-1))
      if (@words[n].unified != @words[first].unified)
        add_range(first, n-1)
        first = n
      end
    end
    add_range(first, n-1)  #--- n is still in scope
    GC.start  #--- Nice try, but makes no difference (neither time nor memory)
  end
end

#---------------------------------------------------------------------------------------- Main program

def main
  text = $stdin.read.upcase!.gsub!(/[^A-Z]+/, '')
  from = (ARGV.length > 0) ? ARGV[0].to_i : 6;
  to   = (ARGV.length > 1) ? ARGV[1].to_i : from;
  for len in (from..to)
    print("%slen = %d\n" % [ ((len > from) ? "\n" : ""), len ])
    AnagramFinder.new(text, len).find_all
  end
  print("string length = %5d, comparisons = %d\n" % [ text.length, $comparisons ])
end

main

You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090

View source for the content of this page.