Parsing indentation-based syntaxes in Haskell's Parsec

I'm trying to parse an indentation-based language (think Python, Haskell itself, Boo, YAML) in Haskell using Parsec. I've seen the IndentParser library, and it looks like the perfect match, but what I can't figure out is how to turn my TokenParser into an indentation parser. Here's the code I have so far:

import qualified Text.ParserCombinators.Parsec.Token as T 
import qualified Text.ParserCombinators.Parsec.IndentParser.Token as IT 

lexer = T.makeTokenParser mylangDef 
ident = IT.identifier lexer

This throws the error:

parser2.hs:29:28: 
    Couldn't match expected type `IT.TokenParser st' 
      against inferred type `T.GenTokenParser s u m' 
    In the first argument of `IT.identifier', namely `lexer' 
    In the expression: IT.identifier lexer 
    In the definition of `ident': ident = IT.identifier lexer

What am I doing wrong? How am I supposed to create an IT.TokenParser? Or is IndentParser broken and to be avoided?


2010-06-11 pavpanchekha

It looks like you're using Parsec 3 here, while IndentParser expects Parsec 2. Your example compiles for me with -package parsec-2.1.0.1.

So IndentParser isn't necessarily broken, but the authors should have been more specific about versions in the list of dependencies. It's possible to have both versions of Parsec installed, so there's no reason not to use IndentParser unless you're committed to Parsec 3 for other reasons.

UPDATE: Actually, no changes to the source are necessary to get IndentParser working with Parsec 3. The problem we were both having seems to be caused by the fact that cabal-install has a "soft preference" for Parsec 2. You can simply reinstall IndentParser with an explicit constraint on the Parsec version:

cabal install IndentParser --reinstall --constraint="parsec >= 3"

Alternatively, you can download the source and build and install it in the normal way.


2010-06-11 16:21:26

You, sir, are amazing! Thanks! How did you know I was using Parsec 3? A guess? Because I think my example could be ... – pavpanchekha

I'm afraid my detective work here wasn't very exciting: I compiled your code with Parsec 3, got an error similar to yours, and then tried Parsec 2, which worked. By the way, it looks like it wouldn't be very hard to get IndentParser working with Parsec 3; you might consider giving it a shot if you find IndentParser useful. –

I might, but right now I'm just learning Haskell; I'm afraid I'd get lost in a foreign codebase like that. – pavpanchekha

Here's a set of parser combinators I put together for Parsec 3 that can be used for Haskell-style layout and that might be of use to you. The key considerations are that laidout starts and runs a layout rule, and that you should use the space and spaced combinators provided here rather than the stock Parsec combinators for the same purpose. Because of the interaction between layout and comments, I had to merge comment parsing into the tokenizer.

{-# LANGUAGE FlexibleContexts, FlexibleInstances, MultiParamTypeClasses #-} 
module Text.Parsec.Layout 
    (laidout   -- repeat a parser in layout, separated by (virtual) semicolons 
    , space   -- consumes one or more spaces, comments, and onside newlines in a layout rule 
    , maybeFollowedBy 
    , spaced   -- (`maybeFollowedBy` space) 
    , LayoutEnv  -- type needed to describe parsers 
    , defaultLayoutEnv -- a fresh layout 
    , semi    -- semicolon or virtual semicolon 
    ) where 

import Control.Applicative ((<$>)) 
import Control.Monad (guard) 

import Data.Char (isSpace) 

import Text.Parsec.Combinator 
import Text.Parsec.Pos 
import Text.Parsec.Prim hiding (State) 
import Text.Parsec.Char hiding (space) 

data LayoutContext = NoLayout | Layout Int deriving (Eq,Ord,Show) 

data LayoutEnv = Env 
    { envLayout :: [LayoutContext] 
    , envBol :: Bool -- if true, must run offside calculation 
    } 

defaultLayoutEnv :: LayoutEnv 
defaultLayoutEnv = Env [] True 

pushContext :: Stream s m c => LayoutContext -> ParsecT s LayoutEnv m() 
pushContext ctx = modifyState $ \env -> env { envLayout = ctx:envLayout env } 

popContext :: Stream s m c => String -> ParsecT s LayoutEnv m() 
popContext loc = do
    (_:xs) <- envLayout <$> getState
    modifyState $ \env' -> env' { envLayout = xs }
  <|> unexpected ("empty context for " ++ loc)

getIndentation :: Stream s m c => ParsecT s LayoutEnv m Int 
getIndentation = depth . envLayout <$> getState where 
    depth :: [LayoutContext] -> Int 
    depth (Layout n:_) = n 
    depth _ = 0 

pushCurrentContext :: Stream s m c => ParsecT s LayoutEnv m() 
pushCurrentContext = do 
    indent <- getIndentation 
    col <- sourceColumn <$> getPosition 
    pushContext . Layout $ max (indent+1) col 

maybeFollowedBy :: Stream s m c => ParsecT s u m a -> ParsecT s u m b -> ParsecT s u m a 
t `maybeFollowedBy` x = do t' <- t; optional x; return t' 

spaced :: Stream s m Char => ParsecT s LayoutEnv m a -> ParsecT s LayoutEnv m a 
spaced t = t `maybeFollowedBy` space 

data Layout = VSemi | VBrace | Other Char deriving (Eq,Ord,Show) 

-- TODO: Parse C-style #line pragmas out here 
layout :: Stream s m Char => ParsecT s LayoutEnv m Layout 
layout = try $ do 
    bol <- envBol <$> getState 
    whitespace False (cont bol) 
    where 
    cont :: Stream s m Char => Bool -> Bool -> ParsecT s LayoutEnv m Layout 
    cont True = offside 
    cont False = onside 

    -- TODO: Parse nestable {-# LINE ... #-} pragmas in here 
    whitespace :: Stream s m Char => 
     Bool -> (Bool -> ParsecT s LayoutEnv m Layout) -> ParsecT s LayoutEnv m Layout 
    whitespace x k = 
      try (string "{-" >> nested k >>= whitespace True) 
     <|> try comment 
     <|> do newline; whitespace True offside 
     <|> do tab; whitespace True k 
     <|> do (satisfy isSpace <?> "space"); whitespace True k 
     <|> k x 

    comment :: Stream s m Char => ParsecT s LayoutEnv m Layout 
    comment = do 
     string "--" 
     many (satisfy ('\n'/=)) 
     newline 
     whitespace True offside 

    nested :: Stream s m Char => 
     (Bool -> ParsecT s LayoutEnv m Layout) -> 
     ParsecT s LayoutEnv m (Bool -> ParsecT s LayoutEnv m Layout) 
    nested k = 
      try (do string "-}"; return k) 
     <|> try (do string "{-"; k' <- nested k; nested k') 
     <|> do newline; nested offside 
     <|> do anyChar; nested k 

    offside :: Stream s m Char => Bool -> ParsecT s LayoutEnv m Layout 
    offside x = do 
     p <- getPosition 
     pos <- compare (sourceColumn p) <$> getIndentation 
     case pos of 
      LT -> do 
       popContext "the offside rule" 
       modifyState $ \env -> env { envBol = True } 
       return VBrace 
      EQ -> return VSemi 
      GT -> onside x 

    -- we remained onside. 
    -- If we skipped any comments, or moved to a new line and stayed onside, we return a single a ' ', 
    -- otherwise we provide the next char 
    onside :: Stream s m Char => Bool -> ParsecT s LayoutEnv m Layout 
    onside True = return $ Other ' ' 
    onside False = do 
     modifyState $ \env -> env { envBol = False } 
     Other <$> anyChar 

layoutSatisfies :: Stream s m Char => (Layout -> Bool) -> ParsecT s LayoutEnv m() 
layoutSatisfies p = guard . p =<< layout 

virtual_lbrace :: Stream s m Char => ParsecT s LayoutEnv m() 
virtual_lbrace = pushCurrentContext 

virtual_rbrace :: Stream s m Char => ParsecT s LayoutEnv m() 
virtual_rbrace = try (layoutSatisfies (VBrace ==) <?> "outdent") 

-- recognize a run of one or more spaces including onside carriage returns in layout 
space :: Stream s m Char => ParsecT s LayoutEnv m String 
space = do
    try $ layoutSatisfies (Other ' ' ==)
    return " "
  <?> "space"

-- recognize a semicolon including a virtual semicolon in layout 
semi :: Stream s m Char => ParsecT s LayoutEnv m String 
semi = do
    try $ layoutSatisfies p
    return ";"
  <?> "semi-colon"
    where 
     p VSemi = True 
     p (Other ';') = True 
     p _ = False 

lbrace :: Stream s m Char => ParsecT s LayoutEnv m String 
lbrace = do 
    char '{' 
    pushContext NoLayout 
    return "{" 

rbrace :: Stream s m Char => ParsecT s LayoutEnv m String 
rbrace = do 
    char '}' 
    popContext "a right brace" 
    return "}" 

laidout :: Stream s m Char => ParsecT s LayoutEnv m a -> ParsecT s LayoutEnv m [a] 
laidout p = try (braced statements) <|> vbraced statements where 
    braced = between (spaced lbrace) (spaced rbrace) 
    vbraced = between (spaced virtual_lbrace) (spaced virtual_rbrace) 
    statements = p `sepBy` spaced semi
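
The decision that drives the module is the column comparison made in offside: a token to the left of the current layout column closes the block (VBrace), a token exactly on it starts the next item (VSemi), and a token to the right continues the current item. As a standalone illustration of just that comparison (the names Offside and classify are hypothetical and not part of the module above):

```haskell
-- What a token at the start of a line means under the offside rule.
-- Hypothetical names, for illustration only.
data Offside
  = CloseBlock    -- plays the role of VBrace in the module above
  | NextItem      -- plays the role of VSemi
  | Continuation  -- the token stays "onside"
  deriving (Eq, Show)

-- Compare the token's column with the indentation of the enclosing block.
classify :: Int -> Int -> Offside
classify blockIndent col =
  case compare col blockIndent of
    LT -> CloseBlock
    EQ -> NextItem
    GT -> Continuation
```

In the module itself the LT case also pops one layout context and sets envBol again, so that the check is repeated against the next enclosing block.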


2010-06-11 14:30:26

Can you give an example of how to use this? I tried [like this](https://gist.github.com/gergoerdi/af1829b18ea80e21ba79728a5d271cd9) but I can't get `bindings` to accept something as simple as `unlines ["x = y", "a = b"]`. – Cactus

I currently believe the source above is broken, but I haven't had a chance to revisit it. –
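
For anyone who only needs aligned blocks rather than full Haskell-style layout, below is a minimal, self-contained sketch that uses nothing but Parsec 3's getPosition and sourceColumn. It is not based on IndentParser or on the Text.Parsec.Layout module above, and it is not a fix for that module. The toy grammar of `name = name` lines and the names Binding, hspaces, atColumn, block and parseBlock are all assumptions made up for illustration.

```haskell
import Control.Monad (guard)
import Text.Parsec
import Text.Parsec.String (Parser)

-- A toy binding such as "x = y" (hypothetical grammar).
data Binding = Binding String String deriving (Eq, Show)

-- Skip spaces and tabs only, so that newlines stay significant.
hspaces :: Parser ()
hspaces = skipMany (oneOf " \t")

ident :: Parser String
ident = many1 letter <* hspaces

binding :: Parser Binding
binding = Binding <$> ident <* char '=' <* hspaces <*> ident

-- Skip leading blanks, then run p only if it starts exactly at column col.
atColumn :: Int -> Parser a -> Parser a
atColumn col p = do
  hspaces
  c <- sourceColumn <$> getPosition
  guard (c == col)
  p

-- The first binding fixes the block's column; each later line must match it.
-- The block ends at the first line that is indented differently.
block :: Parser [Binding]
block = do
  hspaces
  col <- sourceColumn <$> getPosition
  first <- binding
  rest <- many (try (newline *> atColumn col binding))
  return (first : rest)

-- Convenience wrapper that discards the error message.
parseBlock :: String -> Maybe [Binding]
parseBlock = either (const Nothing) Just . parse block ""
```

With these definitions, `parseBlock "x = y\na = b"` yields both bindings, while `parseBlock "x = y\n  a = b"` stops after the first one because the second line starts at a different column. Nested blocks, comments and continuation lines are deliberately left out.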
