/*
  Copyright Dave Bone 1998 - 2014 
  All Rights Reserved. 
  No part of this document may be reproduced without written consent from the author.
	
FILE:		  la_expr.lex
Dates:		  20 May 2005
Purpose:	  Grammar's look ahead boundary expression syntax
*/
/@
@i "/usr/local/yacco2/copyright.w"
@** |la_expr| grammar.\fbreak
Evaluate the lookahead expression making sure
that the elements are well formed.
It's output is a set of enumerated values of Terminals
that is deposited into |T_parallel_la_boundary|'s |la_first_set_|.
|la_expr_lexical| creates its input token container.
As a help to my outputting of the CWEB document,
i rebuild the source la expression  while parsing it
and also deposit it back into |T_parallel_la_boundary|'s |cweb_la_srce_expr|.
This source expression gets outputted as part of a thread's cweb doc's banner.
\fbreak
Let's review the problem:\fbreak
There are ``+'' or ``-'' operators that  add or remove elements from the set.
How can the removal operator exist when Terminal elements are played with?
There are 2 situations that define groups of elements:\fbreak
\ptindent{1) eolr---a terminal representing all terminals including self}
\ptindent{2) rule---where 1st element of each subrule is a terminal or a rule}
The interesting part is recursion of a rule containing a rule.
Of course recurse is devine.
A subtlty creeps in on the use of the ``eolr'' terminal.
It was created to lower fat-city of lookahead sets
and to save the Carpal Tunnel Syndrome caused by typing.
If the ``eolr'' is the only terminal present in the lookahead expression
 then it's singular.
When operators are applied to it, it deconsolidates all its contents 
minus itself to allow for the operators to become effective.
For example A - eolr + B is a round about way to just have B in its set.
This could be taken as an error but i'm in a ``benign state of mind'' 
as the song goes.

Let's review what ``eolr'' can stand for.
If the lookahead boundary takes on all contexts, 
then it represents all terminals: 
no refinement to the run context is needed. 
This is a  lookahead(0) context.
This could be represented by the empty set but
i didn't want to complicate the parsing algorithm to deal
with this situation and so it's a xxx(1) lookahead 
where u plug in your favorite
bottomup algo descriptin in place of the hootch.
When refinement to the contents within the lookahead 
boundary is required, 
the ``eolr'' is followed
by the subtraction operator to remove the  ambiguous component(s)
from the boundary: ambiguity only shows up when 
lring the grammar. The ``eolr'' coupled with the removal operator
 is not a universal rule but is dictated by
the amount of effort the grammar writer needs
to declare the lookahead boundary.
For the explicit coders, each specific terminal can be stated
by the addition operator
but as an aid to the grammar writer 
a rule definition can do the same thing and reduce
the size of the written lookahead expression. 
This coding indirection  can be more
explicit by intent of a well chosen rule's name.
It costs a bit more in coding a rule definition but it is more flexible
in modification and in the reading of a grammar.
A rule definition is allowed where its only purpose is use within
a lookahead expression.
One can still define rules and not use them: a bit
 of junk code or dangling thoughts...
but to each their own. 
\O2 only honks when an undefined rule is referenced in a subrule.\fbreak 
\fbreak
Bugs:\fbreak
The |set_difference| algorithm is buggy with MS. 
So write my own!
The bugs are:\fbreak
 \ptindent{1) does not work A $-$ B where B is empty}
  \ptindent{2) A $-$ B have data and the resulting set is empty}
Item 2 throws an error: ``map/set iterators are not incrementable'' from VS2005.
\fbreak
set\_difference(A\_set$\rightarrow$begin(),A\_set$\rightarrow$end()
,B\_set$\rightarrow$begin(),B\_set$\rightarrow$end(),fset\_.begin());    
\fbreak
Another misfortune sequential reading of a set is not gauranteed
to be in ascending order. So use find probe to determine membership.
\fbreak
\fbreak
Exploding ``eolr'':\fbreak
It means use-all-terminals.
What is considered a Terminal?
Every terminal except itself and the meta terminals:
\PARshift{}, \ALLshift,\QUEshift{}, \INVshift{}, \REDshift{}, \TRAshift{}.
In parsing, the metaterminals represent contextual positions within the fsa state
 that act as action triggers. They only get actioned if they are present within 
 the current fsa's parse state.
 Their actions are ordered: shifts take precedence over reduce.
 The first shift tried is the current token in the parse stream.
 After this the metaterminals are tried in the following order: \fbreak
 \ptindent{\INVshift{} acts as an epsilon}
\ptindent{\ALLshift{} as the all shift}
\ptindent{\REDshift{} represents in a reduced LA set the  \PARshift{} operator}
\ptindent{\TRAshift{} is the transience operator used by \O2's linker}
\ptindent{\QUEshift{} is the wild error catching operator; the unexpected Tes}
\fbreak
Added support for Rule recycling as an new/delete optimization.

As rules are recycled, make sure that any temporary variables are
initialed properly or the part residues will haunt u.
Basically cleared the temporary set variable |fset_| in |Rt| and |Ra| rules
before its use.
@/

fsm	
(fsm-id "la_expr.lex"
,fsm-filename la_expr
,fsm-namespace NS_la_expr
,fsm-class Cla_expr {
  user-prefix-declaration
#include "o2_externs.h" 
extern void XLATE_SYMBOLS_FOR_cweave(const char* Sym_to_xlate,char* Xlated_sym);
  ***
   user-declaration
    public: 
    char ddd_[1024*32];
    int ddd_idx_;
    yacco2::CAbs_lr1_sym* gps_for_error_reporting_;
    void copy_str_into_buffer(std::string* Str);
    void copy_kstr_into_buffer(const char* Str);
    void unionize_sets(T_IN_STBL_SET_type* Add_to_set
                      ,T_IN_STBL_SET_type* A_set
                      ,T_IN_STBL_SET_type* B_set);
    void set_differences(T_IN_STBL_SET_type* Add_to_set
                        ,T_IN_STBL_SET_type* A_set
		        ,T_IN_STBL_SET_type* B_set);
    void explode_eolr(T_IN_STBL_SET_type* A_set);
    void add_element_to_set(T_in_stbl* Elem,T_IN_STBL_SET_type* Set_to_add_to);
    void add_rule_to_set
		(NS_yacco2_terminals::rule_def* Rule
		,T_IN_STBL_SET_type* Set_to_add_to
		,std::set<int>* Rules_already_processed);
  ***
  user-implementation
    void Cla_expr::copy_str_into_buffer(std::string* Str){   
      const char* y = Str->c_str(); 
      int x(0);
      for(;y[x]!=0;++x,++ddd_idx_)ddd_[ddd_idx_] = y[x];
      ddd_[ddd_idx_] = 0;
    }
    void Cla_expr::copy_kstr_into_buffer(const char* Str){
      const char* y = Str; 
      int x(0);
      for(;y[x]!=0;++x,++ddd_idx_)ddd_[ddd_idx_] = y[x];
      ddd_[ddd_idx_] = 0;
    }
    void Cla_expr::unionize_sets(T_IN_STBL_SET_type* Add_to_set
          ,T_IN_STBL_SET_type* A_set,T_IN_STBL_SET_type* B_set){
		if(A_set->find(STBL_T_ITEMS[LR1_Eolr]) !=A_set->end()){
			Add_to_set->insert(A_set->begin(),A_set->end());
			return;		
		}    
		if(B_set->find(STBL_T_ITEMS[LR1_Eolr]) !=B_set->end()){
			Add_to_set->insert(B_set->begin(),B_set->end());
			return;		
		}
		Add_to_set->insert(A_set->begin(),A_set->end());
		Add_to_set->insert(B_set->begin(),B_set->end());
    }
    void Cla_expr::explode_eolr(T_IN_STBL_SET_type* A_set){
      A_set->clear();
      T_enum_phrase* e_p =  O2_T_ENUM_PHASE;
      A_set->insert(STBL_T_ITEMS[LR1_Eog]);// eog

      int tt = e_p->total_enumerate();
      for(int x=8;x<tt;++x){// balance of meta Ts bypassed. Start from RC
        A_set->insert(STBL_T_ITEMS[x]);
      }
    }
    void Cla_expr::set_differences(T_IN_STBL_SET_type* Add_to_set
                ,T_IN_STBL_SET_type* A_set
		,T_IN_STBL_SET_type* B_set){
		if(A_set->find(STBL_T_ITEMS[LR1_Eolr]) !=A_set->end()){
			explode_eolr(A_set);
		}    
		if(B_set->find(STBL_T_ITEMS[LR1_Eolr]) !=B_set->end()){
			explode_eolr(B_set);
		}
		if(A_set->empty() == true){
			CAbs_lr1_sym* sym = new Err_empty_set_removal_in_la_expr;
			sym->set_rc(*gps_for_error_reporting_,__FILE__,__LINE__);
			parser__->set_stop_parse(true);
			ADD_TOKEN_TO_ERROR_QUEUE_FSM(*sym);
                        parser__->set_abort_parse(true);
		}

		T_IN_STBL_SET_ITER_type Ai = A_set->begin();
		T_IN_STBL_SET_ITER_type Aie = A_set->end();
		T_IN_STBL_SET_ITER_type Bie = B_set->end();
		
		T_IN_STBL_SET_ITER_type AinBi;
		for(;Ai != Aie;++Ai){		  
		  AinBi = B_set->find(*Ai);
		  if(AinBi != Bie){
		    continue; // bypass
		  }
		  Add_to_set->insert(*Ai);
		}
    }
    void Cla_expr::add_element_to_set(T_in_stbl* Elem,T_IN_STBL_SET_type* Set_to_add_to){
      if(Set_to_add_to->find(STBL_T_ITEMS[LR1_Eolr]) != Set_to_add_to->end()) return;
      if(Elem  == STBL_T_ITEMS[LR1_Eolr]){
        Set_to_add_to->clear();
      }
      Set_to_add_to->insert(Elem);
    }
    
    void Cla_expr::add_rule_to_set
		(NS_yacco2_terminals::rule_def* Rule
		,T_IN_STBL_SET_type* Set_to_add_to
		,std::set<int>* Rules_already_processed){
		using namespace NS_yacco2_terminals;
		int r_id = Rule->enum_id();
		if(Rules_already_processed->find(r_id) !=Rules_already_processed->end()) return;
		Rules_already_processed->insert(r_id);
		AST* r_t = Rule->rule_s_tree();
		set<yacco2::INT> elem_filter;
        elem_filter.insert(NS_yacco2_T_enum::T_Enum::T_T_subrule_def_);
		tok_can_ast_functor ast_functor;
        ast_prefix_wbreadth_only pwbo(*r_t,&ast_functor,&elem_filter,true);
        tok_can<AST*> tok_can(pwbo);
        for (int xxx = 0;tok_can.operator[](xxx) != yacco2::PTR_LR1_eog__;++xxx){
          T_subrule_def* sr_def = (T_subrule_def*)tok_can.operator[](xxx);
          vector<CAbs_lr1_sym*>* elems_list = sr_def->subrule_elems();
          CAbs_lr1_sym* sym = (*elems_list)[0];
          int id = sym->enumerated_id__;
          switch(id){
            case NS_yacco2_T_enum::T_Enum::T_T_eosubrule_: break;
            case NS_yacco2_T_enum::T_Enum::T_refered_T_: {
              refered_T* rt = (refered_T*)sym;
              add_element_to_set(rt->t_in_stbl(),Set_to_add_to);
              break;
            }
            case NS_yacco2_T_enum::T_Enum::T_refered_rule_: {
              refered_rule* rt = (refered_rule*)sym;
              rule_def* r = rt->Rule_in_stbl()->r_def();
              add_rule_to_set(r,Set_to_add_to,Rules_already_processed);
              break;
            }
            default:{
            }
          }
          
        }
    }
  ***
  constructor
    ddd_idx_ = 0;
    ddd_[ddd_idx_] = 0;
    gps_for_error_reporting_ = 0;
 ***
  op
    ddd_idx_ = 0;
    ddd_[ddd_idx_] = 0;
    gps_for_error_reporting_ = 0;
  ***
}
,fsm-version "1.0",fsm-date "24 mar 2004",fsm-debug "false"
,fsm-comments "Parse the lookahead expression after chaffe removed.")
@"/usr/local/yacco2/compiler/grammars/yacco2_T_includes.T"

rules{
Rla_expr  (){
  -> Re eog {
  op
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      if(sf->p1__->fset_.empty() == true){
		  CAbs_lr1_sym* sym = new Err_la_expr_calc_empty_set;
		  sym->set_rc(*fsm->gps_for_error_reporting_,__FILE__,__LINE__);
		  ADD_TOKEN_TO_ERROR_QUEUE(*sym);
		  rule_info__.parser__->set_abort_parse(true);
      }
      T_parallel_parser_phrase* la_ph = O2_PP_PHASE;
      T_parallel_la_boundary* la = la_ph->la_bndry();
      la->la_first_set(sf->p1__->fset_);
      la->cweb_la_srce_expr((const char*)&fsm->ddd_);
  ***
  }
}

Re  (
lhs {
  user-declaration
    public:
    T_IN_STBL_SET_type fset_;
  ***
  constructor
    fset_.clear();
 ***
  } 
 ){
  ->  Re Rplus Rt {
    op
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      fset_.clear();
      fsm->unionize_sets(&fset_,&sf->p1__->fset_,&sf->p3__->fset_);
    ***
    }
  ->  Re Rminus Rt {
    op
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      fset_.clear();
      fsm->set_differences(&fset_,&sf->p1__->fset_,&sf->p3__->fset_);
    ***
    }
  ->  Re Rerr_bad_oper
  ->  Rt {
    op
      fset_.clear();
      fset_.insert(sf->p1__->fset_.begin(),sf->p1__->fset_.end());
    ***
    }
}

Rerr_bad_oper  (){
  ->  |?| {
    op
      CAbs_lr1_sym* sym = new Err_bad_operator_in_la_expr;
      sym->set_rc(*sf->p1__,__FILE__,__LINE__);
      ADD_TOKEN_TO_ERROR_QUEUE(*sym);
      rule_info__.parser__->set_abort_parse(true);
    ***			
    } 
}

Rt  (
lhs {
    user-declaration
    public:
    T_IN_STBL_SET_type fset_;
  ***  
  constructor
    fset_.clear();
 ***
  }
){
  ->  "T-in-stbl" {
    op
   fset_.clear();
     using namespace NS_yacco2_terminals;
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      T_terminal_def* dt = sf->p1__->t_def();
      fsm->copy_kstr_into_buffer(" ");
      char xlate_file[Max_cweb_item_size];
      XLATE_SYMBOLS_FOR_cweave(dt->t_name()->c_str(),xlate_file);
      fsm->copy_kstr_into_buffer(xlate_file);
      fset_.clear();
      fset_.insert(sf->p1__);
    ***  
  }
  ->  "rule-in-stbl" {
    op
    fset_.clear();
    using namespace NS_yacco2_terminals;
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      rule_def* rt = sf->p1__->r_def();
      fsm->copy_kstr_into_buffer(" ");
      char xlate_file[Max_cweb_item_size];
      XLATE_SYMBOLS_FOR_cweave(rt->rule_name()->c_str(),xlate_file);
      fsm->copy_kstr_into_buffer(xlate_file);

      std::set<int> already_processed_rules;
      fsm->add_rule_to_set(rt,&fset_,&already_processed_rules);
    ***  
  }
  -> |+| { 
    op
  fset_.clear();
    using namespace NS_yacco2_terminals;
    int id = sf->p1__->enumerated_id__;
    switch(id){
		case T_Enum::T_LR1_all_shift_operator_:{
           Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
           fsm->copy_kstr_into_buffer(" $|+|$");
               fsm->add_element_to_set(STBL_T_ITEMS[id],&fset_);
          break; 
		}
		case T_Enum::T_LR1_invisible_shift_operator_:{
           Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
           fsm->copy_kstr_into_buffer(" $|.|$");
                fsm->add_element_to_set(STBL_T_ITEMS[id],&fset_);
          break; 
		}
		case T_Enum::T_LR1_parallel_operator_:{
           Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
           fsm->copy_kstr_into_buffer(" $|||$");
                fsm->add_element_to_set(STBL_T_ITEMS[id],&fset_);
          break; 
		}
		case T_Enum::T_LR1_fset_transience_operator_:{
           Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
           fsm->copy_kstr_into_buffer(" $|t|$");
                fsm->add_element_to_set(STBL_T_ITEMS[id],&fset_);
          break; 
		}

		case T_Enum::T_LR1_reduce_operator_:{
           Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
           fsm->copy_kstr_into_buffer(" $|r|$");
                fsm->add_element_to_set(STBL_T_ITEMS[id],&fset_);
          break; 
		}
		default:{
			CAbs_lr1_sym* sym = new Err_bad_term_in_la_expr;
			sym->set_rc(*sf->p1__,__FILE__,__LINE__);
			rule_info__.parser__->set_stop_parse(true);
			ADD_TOKEN_TO_ERROR_QUEUE(*sym);
            rule_info__.parser__->set_abort_parse(true);
		  }
    }
    ***			
    } 
}
Rminus  (){
  ->  - {
    op
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      fsm->gps_for_error_reporting_=sf->p1__;
      fsm->copy_kstr_into_buffer(" $-$");
    ***  
  }
}
Rplus  (){
  ->  + {
    op
      Cla_expr* fsm = (Cla_expr*)rule_info__.parser__->fsm_tbl__;
      fsm->copy_kstr_into_buffer(" $+$");
    ***  
  }
}
}// end of rules
