I'm writing a hand-made recursive-descent parser for a small language. In the lexer I have:

```scala
trait Token { def position: Int }
trait Keyword  extends Token
trait Operator extends Token
case class Identifier(position: Int, txt: String) extends Token
case class If(position: Int)   extends Keyword
case class Plus(position: Int) extends Operator
/* etcetera; one case class per token type */
```

My parser works well, and I want to incorporate error recovery: replacing, inserting or discarding tokens until a synchronization point.
For that, it would be handy to have a function that, in invalid Scala, looks like this:

```scala
def scanFor(tokenSet: Set[TokenClass], lookahead: Int) = {
  lexer.upcomingTokens.take(lookahead).find { token =>
    tokenSet.exists(tokenClass => token.isInstanceOf[tokenClass])
  }
}
```

which I would call, for example, as: `scanFor(Set(Plus, Minus, Times, DividedBy), 4)`
However, `TokenClass` is of course not a valid type, and I don't know how to create the set above.
I have considered some alternatives:
- I could create a new trait, make the token classes in each set I want to check against extend that trait, and do an `isInstanceOf` check against the trait. However, I may end up with several such sets, which would make them hard to name and the code hard to maintain later on.
- I could create `isXxx: Token => Boolean` functions and make sets of those, but that seems inelegant.
Any suggestions?
I recommend, if there are only a handful of such combinations, using an additional trait. It's easy to write and understand, and fast at runtime. It's not that bad to say

```scala
case class Plus(position: Int) extends Operator with Arithmetic with Precedence7 with Unary
```

but there is a wide range of alternatives.
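For instance, a lookahead scan against a marker trait can be written once with a `ClassTag`, so the trait itself names the set. This is only a sketch; the `Arithmetic` marker trait and the `scanFor` signature are hypothetical, not from the question:

```scala
import scala.reflect.ClassTag

trait Token { def position: Int }
trait Operator extends Token
trait Arithmetic extends Token // hypothetical marker trait grouping arithmetic operators
case class Plus(position: Int)  extends Operator with Arithmetic
case class Times(position: Int) extends Operator with Arithmetic
case class If(position: Int)    extends Token

// Find the first upcoming token (within `lookahead`) that extends T.
// The ClassTag makes the `case t: T` pattern a real runtime check.
def scanFor[T <: Token: ClassTag](upcoming: Seq[Token], lookahead: Int): Option[Token] =
  upcoming.take(lookahead).collectFirst { case t: T => t }
```

With this, `scanFor[Arithmetic](tokens, 4)` replaces a `Set` of token classes with a single type argument, at the cost of wiring each marker trait into the token definitions.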
If you don't mind a finicky manual maintenance process and need this to be fast, defining an id number (which you must manually keep distinct) for each token type will let you use a `Set[Int]` or `BitSet` or `Long` to select the classes you like. You can then use set operations (union, intersection) to build these selectors from each other. And it's not hard to write unit tests to make the finicky bit a little more reliable, if you can at least manage a list of the types:

```scala
val everyone = Seq(Plus, Times, If /* etc */)
assert(everyone.length == everyone.map(_.id).toSet.size)
```

So you shouldn't be alarmed by this approach if you decide performance and composability are essential.
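To sketch what such selectors might look like in practice (the `id` assignments and the `scanFor` helper here are hypothetical, not from the answer itself):

```scala
import scala.collection.immutable.BitSet

sealed trait Token { def position: Int; def id: Int }
case class Plus(position: Int)  extends Token { def id = 0 } // ids kept distinct by hand
case class Minus(position: Int) extends Token { def id = 1 }
case class Times(position: Int) extends Token { def id = 2 }
case class If(position: Int)    extends Token { def id = 3 }

// Selectors are plain bit sets, so they compose with ordinary set operations.
val arithmetic = BitSet(0, 1, 2)
val sync       = arithmetic | BitSet(3) // union: arithmetic operators or `if`

def scanFor(upcoming: Seq[Token], selector: BitSet, lookahead: Int): Option[Token] =
  upcoming.take(lookahead).find(t => selector(t.id))
```

The membership test is a single bit lookup, and building `sync` from `arithmetic` shows the composability the answer mentions.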
You can also write custom extractors that can (more slowly) pull out the right subset of tokens by pattern matching. For example,

```scala
object ArithOp {
  def unapply(t: Token): Option[Operator] = t match {
    case o: Operator => o match {
      case _: Plus | _: Minus | _: Times | _: DividedBy => Some(o)
      case _ => None
    }
    case _ => None
  }
}
```

will give `None` if it's not the right type of operation. (In this case, I'm assuming there's no parent trait other than `Operator`.)
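An extractor like this slots directly into a pattern-matching lookahead scan. A small self-contained usage sketch, with simplified stand-in token definitions and a hypothetical `scanFor`:

```scala
sealed trait Token { def position: Int }
sealed trait Operator extends Token
case class Plus(position: Int)  extends Operator
case class Minus(position: Int) extends Operator
case class Identifier(position: Int, txt: String) extends Token

object ArithOp {
  def unapply(t: Token): Option[Operator] = t match {
    case o: Operator => Some(o) // simplified: here every Operator counts as arithmetic
    case _           => None
  }
}

// The extractor doubles as the "token set": collectFirst keeps
// only tokens that ArithOp accepts.
def scanFor(upcoming: Seq[Token], lookahead: Int): Option[Operator] =
  upcoming.take(lookahead).collectFirst { case ArithOp(o) => o }
```

This trades the speed of an `isInstanceOf` or bit-set check for ordinary pattern-matching syntax at the use site.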
Finally, you could express the types as unions and HLists and pick them out that way using Shapeless, but I don't have any experience doing this for a parser, so I'm not sure what difficulties you might encounter.