[Rubygrammar-grammarians] embedded expressions and dynamic quotes

Terence Parr parrt at cs.usfca.edu
Sat Nov 26 18:39:52 EST 2005


On Nov 26, 2005, at 2:04 PM, Martin Traverso wrote:

> Here's another tricky one: parsing double quoted strings,  
> especially when they use the %Q syntax.
>
> First, strings can contain nested ruby expressions (full programs,  
> actually):
>
> "1 + 2 = #{1 + 2}"  -> 1 + 2 = 3

I've seen this in Groovy.  This looks easier, since the #{ clearly
identifies where a new nested scope starts.  The nested expression
stuff is not that big of a deal in this case.  Seems to me I made an
example of this for the Groovy guys...hmm...wonder what happened to
that.  When you are parsing a string, the #{ triggers an action that
creates a Ruby parser instance and calls the expr rule to match the
embedded expression on the input stream.
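
A minimal sketch of that trigger point in Ruby, under some assumptions:
split_interpolations is a hypothetical helper, and it stands in for the
nested parser by counting braces to find the matching }.  A real lexer
would hand the stream to the parser's expr rule instead, since brace
counting is fooled by a } inside a string literal in the embedded
expression.

```ruby
# Split the body of a double-quoted string into literal text and the
# source of each embedded expression. Hypothetical helper for
# illustration; brace counting is a stand-in for calling a real parser.
def split_interpolations(body)
  parts = []
  literal = +""
  i = 0
  while i < body.length
    if body[i, 2] == '#{'
      # Flush the literal text seen so far, then find the matching "}".
      parts << [:text, literal] unless literal.empty?
      literal = +""
      depth = 1
      j = i + 2
      while j < body.length && depth > 0
        depth += 1 if body[j] == "{"
        depth -= 1 if body[j] == "}"
        j += 1
      end
      raise ArgumentError, "unterminated interpolation at offset #{i}" if depth > 0
      # j now points just past the closing brace.
      parts << [:expr, body[(i + 2)...(j - 1)]]
      i = j
    else
      literal << body[i]
      i += 1
    end
  end
  parts << [:text, literal] unless literal.empty?
  parts
end

split_interpolations('1 + 2 = #{1 + 2}')
# => [[:text, "1 + 2 = "], [:expr, "1 + 2"]]
```

Each :expr entry is the text that the nested parser instance would be
asked to match.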

Heh, I found it!  Ok, here is some nested crap in comments:

   int i = 0;
/** @author foo {{z=3; q=4;}} {yy=33;}*/
method foo() {
   int j = i;
   i = 4;
}
/**  @author    bar
*/
method zero() {
   return 0;
}

I used a simple Java-like language as an example, with both embedded
javadoc @author stuff and embedded expressions within the comments,
triggered by simple {...}.

> Double-quoted strings can also be constructed with % or %Q and a
> delimiter character, according to these rules (from the Pickaxe book).
>
> "Following the type character is a delimiter, which can be any
> nonalphabetic or nonmultibyte character. If the delimiter is one of
> the characters (, [, {, or <, the literal consists of the characters
> up to the matching closing delimiter, taking account of nested
> delimiter pairs. For all other delimiters, the literal comprises the
> characters up to the next occurrence of the delimiter character."

Holy crap!  That is VERY tough for a static lexer to deal with, since
the closing delimiter is not known until the lexer has seen the opening
one.  I wonder if a semantic predicate will help us out here...hmm...
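
To make the quoted rule concrete, here is a rough Ruby sketch of a
hand-written scanner for it.  scan_percent_literal and CLOSERS are
made-up names, and the sketch ignores backslash escapes and any
delimiter characters that show up inside #{...}, so treat it as an
illustration of the delimiter matching only.

```ruby
# Closing delimiter for each bracket-style opener named in the rule.
CLOSERS = { "(" => ")", "[" => "]", "{" => "}", "<" => ">" }.freeze

# Scan a % or %Q literal that begins at src[start]. Returns the literal
# body and the index just past the closing delimiter.
# Hypothetical helper; escapes and interpolation are not handled.
def scan_percent_literal(src, start = 0)
  raise ArgumentError, "not a % literal" unless src[start] == "%"
  open_pos = start + 1
  open_pos += 1 if src[open_pos] == "Q"
  opener = src[open_pos]
  raise ArgumentError, "missing delimiter" if opener.nil?

  closer = CLOSERS.fetch(opener, opener) # non-bracket delimiters close themselves
  nests = CLOSERS.key?(opener)           # only bracket pairs can nest
  depth = 1
  i = open_pos + 1
  while i < src.length
    ch = src[i]
    if nests && ch == opener
      depth += 1
    elsif ch == closer
      depth -= 1
      return [src[(open_pos + 1)...i], i + 1] if depth.zero?
    end
    i += 1
  end
  raise ArgumentError, "unterminated literal"
end

scan_percent_literal('%Q(a (b) c) rest')  # => ["a (b) c", 11]
scan_percent_literal('%Q/1 + 2/')         # => ["1 + 2", 9]
```

The point is that the closing character is computed from the opening
one at scan time, which is exactly the part a fixed lexer rule cannot
express on its own.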

> Here are a few examples of valid strings:
>
> %Q/1 + 2 = #{ 1 + 2 }/        # -> 1 + 2 = 3

Wow.  Might have to have the input scanner solve this.  Yes, that
would be easiest.  Convert the start/stop delimiters to some bizarre
char sequence unknown to Ruby users.  Then we effectively normalize
these literals into something static the lexer can scarf. :)  Ok,
nothing impossible so far.
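
Something like the following Ruby sketch is what I have in mind for the
normalizing pass.  Everything here is an assumption for illustration:
the name normalize_percent_q, the choice of "\u0001" and "\u0002" as the
bizarre start/stop markers, and the restriction to %Q (a bare % is
skipped because it is hard to tell apart from the modulo operator
without more context).  Escapes and delimiters inside #{...} are not
handled.

```ruby
OPEN_MARK  = "\u0001" # assumed never to occur in real source text
CLOSE_MARK = "\u0002"
BRACKETS = { "(" => ")", "[" => "]", "{" => "}", "<" => ">" }.freeze

# Rewrite every %Q literal so that it is bracketed by fixed marker
# characters instead of a user-chosen delimiter. Hypothetical pre-pass.
def normalize_percent_q(src)
  out = +""
  i = 0
  while i < src.length
    opener = src[i + 2]
    # Copy through anything that is not "%Q" plus a usable delimiter.
    unless src[i, 2] == "%Q" && opener && opener !~ /[[:alnum:]\s]/
      out << src[i]
      i += 1
      next
    end
    closer = BRACKETS.fetch(opener, opener)
    depth = 1
    j = i + 3
    while j < src.length
      if closer != opener && src[j] == opener
        depth += 1 # nested bracket pair
      elsif src[j] == closer
        depth -= 1
        break if depth.zero?
      end
      j += 1
    end
    raise ArgumentError, "unterminated %Q literal at offset #{i}" if j >= src.length
    out << OPEN_MARK << src[(i + 3)...j] << CLOSE_MARK
    i = j + 1
  end
  out
end

normalize_percent_q('x = %Q/1 + 2/ + %Q(a (b) c)')
# => "x = \u00011 + 2\u0002 + \u0001a (b) c\u0002"
```

After this pass the lexer only needs one fixed rule: OPEN_MARK, anything
up to CLOSE_MARK.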

Ter



