This is the Fortran 77 implementation of the anagram program. The problem and the solution is described in the main anagram article. Do read that article first.
Don’t write new programs in Fortran 77. No-one else does. But by all means, do write new programs in later versions of Fortran, especially if you are into high-performance computing, and area where Fortran shines.
This implementation is meant to follow the Fortran 77 standard. Since I do not have a true Fortran 77 compiler, I have used GNU gfortran with -std=legacy. But legacy is not Fortran 77. It does not accept ELSE IF, which is in the Fortran 77 standard, but it does accept other things that are not in the standard. To help me keep to the straight and narrow, I have used my old F77 textbook and some Fortran 77 manuals.
- DEC: PDP-11 FORTRAN-77 Language Reference Manual (July 1983)
- SGI: FORTRAN 77 Language Reference Manual (1990)
- IBM: VS FORTRAN Version 2, Language and Library Reference, Release 6 (Nov 1993)
The code should be compatible with all versions of Fortran up to but not including Fortran 2018. Fortran 2018 deletes (yes, that is the word they use) COMMON blocks from the language, making Fortran 2018 incompatible with Fortran 77.
I have taken some liberty with the standard. Neither of the following is in the standard, but implemented by many compilers:
- The exclamation mark (!) as a comment character.
- The INCLUDE statement.
There is no built-in sorting in Fortran. Sorting in this program is implemented using heapsort. In a language like Fortran 77, that does not provide recursion, heapsort is much easier to implement tham recursive algorithms like quicksort. Another advantage of heapsort is that the execution time is always n log n, while quicksort can be n squared for some input.
Commented source code
Shared constants
integer SHORTEST, LONGEST, TEXT_SIZE, ACC_SIZE, ACC_MINIMUM, A, Z parameter (SHORTEST = 3 ) ! At least this many characters in an anagram parameter (LONGEST = 10 ) ! At most - " - parameter (TEXT_SIZE = 500*512 ) ! Maximum size of the letter string parameter (ACC_SIZE = 50 ) ! Max # of accepted entries. parameter (ACC_MINIMUM = 3 ) ! Fewer than this does not count parameter (A = ichar('A')) ! Character code for 'A' parameter (Z = ichar('Z')) ! Character code for 'Z'
Shared variables
text holds the uppercased text containing only letters. unified holds the array of unified strings. unified(n) is the unified version of the string of length len starting at text(n) words holds indices into unified. These indices are also indices into text. unified holds the indices for accepted candidates with the same unified value, and naccepted counts them. naccepted is reset to zero every time a new unified value is found. comparisons counts the number of calls to compare_words.
All shared variables are put in a single named COMMON block. Fortran 77 has no other way of sharing variables, except by passing them as parameters, which quickly becomes tedious.
character*(TEXT_SIZE) text character*(LONGEST) unified(1:TEXT_SIZE-SHORTEST+1 ) integer words (1:TEXT_SIZE-SHORTEST+1+1) integer len integer nwords integer accepted(1:ACC_SIZE) integer naccepted integer comparisons common /globals/words,unified,text,len,nwords,naccepted,accepted, * comparisons
read_text(): Read the input file
Reads at most 1 million (effectively an infinite number of) lines from the standard input file. Discard all characters that are not letters. Convert lowercase letters to uppercase. Return the number of characters read.
integer function read_text() include "anagram-f77-common.i" character*(80) line integer put, i, n, ch character*(256) map C-- Assumes fixed format, 72 chars per line data map/'-------------------------------------------------------- *--------ABCDEFGHIJKLMNOPQRSTUVWXYZ------ABCDEFGHIJKLMNOPQRSTUVWXYZ *------------------------------------------------------------------ *------------------------------------------------------------------ *--'/ put = 1 do i = 1, 1000000, 1 read (*,100,end=999) line 100 format(A80) do n = 1, 80, 1 ch = ichar(line(n:n)) if (map(ch:ch) .ne. '-') then if (put > TEXT_SIZE) stop 'Input text too large' text(put:put) = map(ch:ch) put = put+1 endif enddo enddo 999 read_text = put-1 end
get_unified()
Compute the unified version of the string of len characters starting at text(pos) To avoid sorting, count the number of times each letter occurs. The statement lcount = 0 sets all elements to zero.
function get_unified(pos) include "anagram-f77-common.i" character*(LONGEST) get_unified integer pos integer lcount(A:Z), n, put, tmp !!! Count each letter lcount = 0 do n = pos, pos+len-1 tmp = ichar(text(n:n)) lcount(tmp) = lcount(tmp) + 1 enddo !!! Build key from letter counts put = 1 get_unified = '' do n = A, Z do tmp = 1, lcount(n) get_unified(put:put) = char(n) put = put + 1 enddo enddo end
compare_words()
First compare the unified versions of the strings. If they are equal, then compare their positions in the text.
integer function compare_words(i1, i2) include "anagram-f77-common.i" integer i1, i2 integer pos1, pos2 pos1 = words(i1) pos2 = words(i2) comparisons = comparisons + 1 compare_words = pos1 - pos2 if (unified(pos1) .gt. unified(pos2)) then compare_words = +1 else if (unified(pos1) .lt. unified(pos2)) then compare_words = -1 endif endif end
function add_to_heap()
The subroutine is called with the size of the heap before insertion as its first parameter, and a “pointer” (in the form of an index into the array unified as it second argument. The new value is placed at the new bootom of the heap and then bubbled up. Unnecessary word swap are avoided by not actually moving the new word until its final position is known.
subroutine add_to_heap(size, value) include "anagram-f77-common.i" integer size, value integer here, compare_words, cmp words(size+1) = value here = size 100 if (here .gt. 1) then ! while (here > 1) cmp = compare_words(size+1, here/2) if (cmp .gt. 0) then words(here) = words(here/2) here = here / 2 goto 100 endif endif words(here) = words(size+1) end
subroutine heap_sort()
The heap is already a heap, so this is the second phase of a heap sort. Save the value at the top of the heap in a safe place, w In thought, but not in deed, Move the last element to the top of the heap. Then, still in thought, let it trickle down until its rightful place has been found. Values bubble up to leave space for the element. Then, in statement 199, move, in deed, the last element to its rightful place. Finally, in the next statement, move the element that was at the top to the head of the sorted list.
Repeat until the heap has size 1.
subroutine heap_sort() include "anagram-f77-common.i" integer c12, empty, child, i, j, w, compare_words integer hsize do i = nwords, 2, -1 w = words(1) hsize = i - 1 empty = 1 do j = 1, 10000000 if (2*empty .gt. hsize) goto 199 c12 = compare_words(i, 2*empty ) if (2*empty .eq. hsize) then ! A single (left) child if (c12 .lt. 0) words(empty) = words(2*empty) goto 199 ! F77 does not have the EXIT statement else if (c12 .gt. 0) then if (compare_words(i, 2*empty+1) .gt. 0) goto 199 ! Two children, both lesser endif child = 2*empty if (compare_words(child, child+1) .lt. 0) child = child+1 ! Right child is greater words(empty) = words(child) empty = child endif enddo 199 words(empty) = words(i) words(i) = w enddo end
subroutine add_candidate()
Add the candidate pointed to by the integer ptr (an index), but only if it does not overlap any existing candidates, and only if it is not a duplicate.
The candidate is an index into text and at the same time an index into unified. Candidates with the same unified value are added in increasing index order, so the overlap test need only look at the most recently accepted candidate.
subroutine add_candidate(ptr) include "anagram-f77-common.i" integer ptr,i, p2 if (naccepted .gt. 0) then if (ptr .le. accepted(naccepted)+(len-1)) then return endif endif do i = 1, naccepted p2 = accepted(i) if (text(ptr:ptr+len-1) .eq. text(p2:p2+len-1)) then return endif enddo naccepted = naccepted + 1 accepted(naccepted) = ptr end
subroutine print_result()
subroutine print_result() include "anagram-f77-common.i" integer i, ptr write(*,100, advance='no') naccepted 100 format(I3, ': ') do i = 1, naccepted ptr = accepted(i) write(*,110, advance='no') ptr-1, text(ptr:ptr+len-1) 110 format(I5, ': ', A) enddo write(*,*) end
subroutine find_all(text_len)
Find all anagrams of length len, already set by the main program.text_len is the number of characters in text.For each possible word, compute the unified value and add it to the heap words, then sort the heap. The words in words are now sorted by, and therefore grouped by, their unified value, and words with equal unified value are sorted by their index into text. For each word, call add_candidate. If the current word is the last in a suite of words with the same unifoed value 1) print the result if the number af accepted words reaches the threshold MINIMUM, and 2) clear the accepted list by setting naccepted to zero.
Note the use of a sentinel to simplify the testing for “last in a suite”.
subroutine find_all(text_len) include "anagram-f77-common.i" integer i integer text_len character*(LONGEST) get_unified nwords = text_len-len+1 do i=1, nwords ! call get_unified(i, unified (i)) unified(i) = get_unified(i) call add_to_heap(i, i) enddo call heap_sort() unified(nwords+1) = 'ZZZZZZZZZZ' words(nwords+1) = nwords + 1 do i=1, nwords call add_candidate(words(i)) if (unified(words(i)) .ne. unified(words(i+1))) then if (naccepted .ge. ACC_MINIMUM) then call print_result() endif naccepted = 0 endif enddo end
program main
Read the input text into text, then call find_all for all lengths len in the range from to to inclusive.
F77 does not have routines for accessing the command line, so the range is set by a DATA statement.
program main include "anagram-f77-common.i" integer text_len integer read_text integer from, to data from,to/14,7/ comparisons = 0 if (to .lt. from ) stop '*** "from" > "to"' if (to .gt. LONGEST ) stop '*** "to" is too large' if (from .lt. SHORTEST) stop '*** "from" is too small' text_len = read_text() do len = from, to if (len .gt. from) write(*,*) write(*,100) 'len = ', len 100 format(A, I0) call find_all(text_len) enddo write(*, 110) text_len, comparisons 110 format("string length = ", I5, ", comparisons = ", I5) end
Full source code
Parameters and COMMON block (anagram-f77-common.i)
!!!---* -*- mode: fortran -*- !------------------------------------------------------------------ !!!--- Compile-time constants integer SHORTEST, LONGEST, TEXT_SIZE, ACC_SIZE, ACC_MINIMUM, A, Z parameter (SHORTEST = 3 ) ! At least this many characters in an anagram parameter (LONGEST = 10 ) ! At most - " - parameter (TEXT_SIZE = 500*512 ) ! Maximum size of the letter string parameter (ACC_SIZE = 50 ) ! Max # of accepted entries. parameter (ACC_MINIMUM = 3 ) ! Fewer than this does not count parameter (A = ichar('A')) ! Character code for 'A' parameter (Z = ichar('Z')) ! Character code for 'Z' !!!---------------------------------------------------------------- !!!--- Global variables character*(TEXT_SIZE) text character*(LONGEST) unified(1:TEXT_SIZE-SHORTEST+1 ) integer words (1:TEXT_SIZE-SHORTEST+1+1) integer len integer nwords integer accepted(1:ACC_SIZE) integer naccepted integer comparisons common /globals/words,unified,text,len,nwords,naccepted,accepted, * comparisons
The program (anagram-f77.f)
!!!----------------------------------------------------------- read_text integer function read_text() include "anagram-f77-common.i" character*(80) line integer put, i, n, ch character*(256) map C-- Assumes fixed format, 72 chars per line data map/'-------------------------------------------------------- *--------ABCDEFGHIJKLMNOPQRSTUVWXYZ------ABCDEFGHIJKLMNOPQRSTUVWXYZ *------------------------------------------------------------------ *------------------------------------------------------------------ *--'/ put = 1 do i = 1, 1000000, 1 read (*,100,end=999) line 100 format(A80) do n = 1, 80, 1 ch = ichar(line(n:n)) if (map(ch:ch) .ne. '-') then if (put > TEXT_SIZE) stop 'Input text too large' text(put:put) = map(ch:ch) put = put+1 endif enddo enddo 999 read_text = put-1 end !!!--------------------------------------------------------- get_unified function get_unified(pos) include "anagram-f77-common.i" character*(LONGEST) get_unified integer pos integer lcount(A:Z), n, put, tmp !!! Count each letter lcount = 0 do n = pos, pos+len-1 tmp = ichar(text(n:n)) lcount(tmp) = lcount(tmp) + 1 enddo !!! Build key from letter counts put = 1 get_unified = '' do n = A, Z do tmp = 1, lcount(n) get_unified(put:put) = char(n) put = put + 1 enddo enddo end !!!------------------------------------------------------- compare_words integer function compare_words(i1, i2) include "anagram-f77-common.i" integer i1, i2 integer pos1, pos2 pos1 = words(i1) pos2 = words(i2) comparisons = comparisons + 1 compare_words = pos1 - pos2 if (unified(pos1) .gt. unified(pos2)) then compare_words = +1 else if (unified(pos1) .lt. unified(pos2)) then compare_words = -1 endif endif end !!!--------------------------------------------------------- add_to_heap subroutine add_to_heap(size, value) include "anagram-f77-common.i" integer size, value integer here, compare_words, cmp words(size+1) = value here = size 100 if (here .gt. 1) then ! while (here > 1) cmp = compare_words(size+1, here/2) if (cmp .gt. 0) then words(here) = words(here/2) here = here / 2 goto 100 endif endif words(here) = words(size+1) end !!!----------------------------------------------------------- heap_sort subroutine heap_sort() include "anagram-f77-common.i" integer c12, empty, child, i, j, w, compare_words integer hsize do i = nwords, 2, -1 w = words(1) hsize = i - 1 empty = 1 do j = 1, 10000000 if (2*empty .gt. hsize) goto 199 c12 = compare_words(i, 2*empty ) if (2*empty .eq. hsize) then ! A single (left) child if (c12 .lt. 0) words(empty) = words(2*empty) goto 199 ! F77 does not have the EXIT statement else if (c12 .gt. 0) then if (compare_words(i, 2*empty+1) .gt. 0) goto 199 ! Two children, both lesser endif child = 2*empty if (compare_words(child, child+1) .lt. 0) child = child+1 ! Right child is greater words(empty) = words(child) empty = child endif enddo 199 words(empty) = words(i) words(i) = w enddo end !!!------------------------------------------------------- add_candidate subroutine add_candidate(ptr) include "anagram-f77-common.i" integer ptr,i, p2 if (naccepted .gt. 0) then if (ptr .le. accepted(naccepted)+(len-1)) then return endif endif do i = 1, naccepted p2 = accepted(i) if (text(ptr:ptr+len-1) .eq. text(p2:p2+len-1)) then return endif enddo naccepted = naccepted + 1 accepted(naccepted) = ptr end !!!-------------------------------------------------------- print_result subroutine print_result() include "anagram-f77-common.i" integer i, ptr write(*,100, advance='no') naccepted 100 format(I3, ': ') do i = 1, naccepted ptr = accepted(i) write(*,110, advance='no') ptr-1, text(ptr:ptr+len-1) 110 format(I5, ': ', A) enddo write(*,*) end !!!------------------------------------------------------------ find_all subroutine find_all(text_len) include "anagram-f77-common.i" integer i integer text_len character*(LONGEST) get_unified nwords = text_len-len+1 do i=1, nwords ! call get_unified(i, unified (i)) unified(i) = get_unified(i) call add_to_heap(i, i) enddo call heap_sort() unified(nwords+1) = 'ZZZZZZZZZZ' words(nwords+1) = nwords + 1 do i=1, nwords call add_candidate(words(i)) if (unified(words(i)) .ne. unified(words(i+1))) then if (naccepted .ge. ACC_MINIMUM) then call print_result() endif naccepted = 0 endif enddo end !!!----------------------------------------------------------------main program main include "anagram-f77-common.i" integer text_len integer read_text integer from, to data from,to/14,7/ comparisons = 0 if (to .lt. from ) stop '*** "from" > "to"' if (to .gt. LONGEST ) stop '*** "to" is too large' if (from .lt. SHORTEST) stop '*** "from" is too small' text_len = read_text() do len = from, to if (len .gt. from) write(*,*) write(*,100) 'len = ', len 100 format(A, I0) call find_all(text_len) enddo write(*, 110) text_len, comparisons 110 format("string length = ", I5, ", comparisons = ", I5) end
You can reach me by email at “lars dash 7 dot sdu dot se” or by telephone +46 705 189090