Syntax errors in Pluscal code for BayerMoore algorithm

128 views Asked by At

I attempted to port the BayerMoore algorithm for finding patterns in strings using the TLA+ toolkit. My main question is about the syntax. I think this error is thrown by these lines as the character to ASCII conversion isn't automatic. This conversion is also required in the rest of the code. How can I fix this ? Can I also ask if there is a way to execute the code to test just once without simulating using the model checker ?

  p := in_pattern[i];
  flag[ p ] := 1;

I understand that the state space here is enormous. So I just simulate now but that doesn't seem to complete either.

: Attempted to apply function: ( 0 :> -1 @@ 1 :> -1 @@ 2 :> -1 @@ 3 :> -1 @@ 4 :> -1 @@ ----- ----- to argument "T", which is not in the domain of the function.

This is the pluscal code. It is written in an imperative style as I am still learning TLA+

EXTENDS Integers, Sequences,Naturals, TLC

Max(x,y) == IF x < y THEN y ELSE x


(*--algorithm bayermoore
variables
           i,m,l,n,j,k,p,skips,
           flag \in [0 .. 256 -> -1 .. -1],
           in_pattern = <<"T", "E", "S", "T">>,
           in_text = <<"N", "T", "E", "S", "T", "E", "E", "D">>;
 
begin
 i := 0;          

 while i < Len(in_pattern) do
  i := i + 1;
  p := in_pattern[i];
  flag[ p ] := 1;
 end while;
 
 

m := Len(in_pattern); n := Len(in_text); j := m - 1; k := n - m;
 while j <= k do
  skips := 0;
     l := m - 1;
     while j >= 0 do
        if in_text[j] # in_text[i + j]
            then
            skips := Max(1,j - flag[in_text[i+j]]);
            skip;
            else
            j :=  j - 1;
        end if
     end while;
     if skips = 0
        then
        print i;
     end if
 end while;
 end algorithm; *)

I believe the loops will work but that isn't fully tested.

The Java code that works is this.

public class BoyerMoore {

    int R = 256;
    String pattern = "TEST";
    String text = "MYFUNNYONLINETESTCANSUCCEEDORNOT";
    int[] right = new int[256];
    public BoyerMoore(){
        for(int i = 0 ; i < R ; i++){
            right[i] = -1;
        }
        for( int i = 0 ; i < pattern.length() ; i ++ ){
            right[pattern.charAt(i)] = i;
        }
    }

    public int search(){
        int m = pattern.length();
        int n = text.length();
        int skip;

        for(int i = 0 ; i <= n - m ; i += skip){
            skip = 0;
            for( int j = m - 1 ; j >= 0 ;j --){
                if( pattern.charAt(j) != text.charAt(i+j)){
                    skip = Math.max(1,j - right[text.charAt(i+j)]);
                    break;
                }
            }
            if( skip == 0 ) return i;
        }
        return n;
    }
1

There are 1 answers

0
Mohan Radhakrishnan On BEST ANSWER

This helped me understand how to use the TLA+ toolkit and execute the TLC Model checker even though the spec./code may have a bug.

But mainly I used

struct == [a |-> 97, b |-> 98, c |-> 99]

and a subset of values in the TLA Toolkit configuration section.

This section

CONSTANTS Character, text, pattern
ASSUME text \in Seq(Character)
ASSUME pattern \in Seq(Character)

uses the configured values when the model checker executes.

So now the state space is small. This was suggested to me by Stephan Merz when I asked the TLA+ Google group.

I tried a few methods and eventually set it up like this in the toolkit. enter image description here The original function that I was looking for is this.

Ascii(char) == 96 + CHOOSE z \in 1 .. 26 :"abcdefghijklmnopqrstuvwxyz"[z] = char

But this throws an error

----------------------------- MODULE bayermoore -----------------------------
EXTENDS Integers, Sequences,Naturals, TLC
CONSTANTS Character, text, pattern
ASSUME text \in Seq(Character)
ASSUME pattern \in Seq(Character)
Max(x,y) == IF x < y THEN y ELSE x
struct == [a |-> 97, b |-> 98, c |-> 99]
(*--algorithm bayermoore

variables
           i,m,l,n,j,k,p,skips,x,y,
           flag = [g \in 0..256 |-> -1];

begin
 i := 1;          

 while i < Len(pattern) do
  p := struct[pattern[i]];
  flag[p] := 1;
  i := i + 1;
 end while;
 
 
i := 1;          
m := Len(pattern);
n := Len(text); 
k := n - m;
 while i <= k do
  skips := 0;
     j := m - 1;
     while j >= 0 do
            x := struct[pattern[j+1]];
            y := struct[text[i + j]];
            \* print ( "x " \o ToString(x) \o "y " \o ToString(y) );
            print ( "i " \o ToString(i) \o " j " \o ToString(j));
        if x # y then
            skips := Max(1,j - flag[y]);
            j := -1;
        end if;
        j :=  j - 1;
     end while;
  i := i + skips;  
  print ("i - " \o ToString(i)); 
 end while;
 end algorithm; *)