Oh, wait, that's just lexing operations
But you could probably parse over a sequence of strings instead of a sequence of chars
Might also help with the performance?
Maybe? not sure
* infinisil looks a bit closer at the hnix parser
I think what you'd end up with is that you'd parse over [(Maybe Token, String)]
With all non-semantic elements being represented with (Nothing, "# Foobar"), for example
Synthetica: Either Token String?
Possibly? But I'd say that if you take all second elements and cat them end-to-end, you should get your original program back, and an Either would prevent that
Hmm I see
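A minimal sketch of the round-trip property described above, assuming a made-up `Token` type (the chat never defines one) and plain `String` instead of `Text`. The names `Stream`, `reconstruct`, `semanticTokens` and `example` are hypothetical, chosen only for illustration:

```haskell
-- Hypothetical token type; a real Nix lexer would have many more cases.
data Token = TIdent String | TSymbol String
  deriving (Show, Eq)

-- Every element keeps its exact source text. Nothing marks
-- non-semantic text such as comments and whitespace.
type Stream = [(Maybe Token, String)]

-- Concatenating the second components gives back the original program.
reconstruct :: Stream -> String
reconstruct = concatMap snd

-- The view a parser would care about: semantic tokens only.
semanticTokens :: Stream -> [Token]
semanticTokens stream = [t | (Just t, _) <- stream]

-- Hand-written stream for the source text "x = y; # Foobar".
example :: Stream
example =
  [ (Just (TIdent "x"), "x")
  , (Nothing, " ")
  , (Just (TSymbol "="), "=")
  , (Nothing, " ")
  , (Just (TIdent "y"), "y")
  , (Just (TSymbol ";"), ";")
  , (Nothing, " # Foobar")
  ]
```

Here `reconstruct example` evaluates to `"x = y; # Foobar"`. With `Either Token String`, the `Left` elements would carry no source text, so the same concatenation would no longer be possible.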
Synthetica: You mean that it should then be `type Parser = Parsec Void [(Maybe Token, String)]`?
Probably, maybe with something else instead of List
Yeah, the slight problem is that megaparsec doesn't easily work with non-standard streams
Synthetica: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Either Comment Token }`
(Maybe with something other than a list)
Then the original can be recovered with the SourceSpans and the original Text
But you'd also want to store the original whitespace right?
Yeah no, with the SourceSpans and the Comments/Tokens (to pretty-print it)
Synthetica: Hmm, you mean whether it was tab/space/newline and how many?
I know we said 100% reproduce the original text, but is there any point to keeping the spacing exactly the same?
I guess if people have certain editor preferences and use tabs to align stuff..
Probably, for example in nix-linter I'd like to rewrite part of a document without rewriting the whole thing
Hmm yeah, okay spacing information needs to be kept as well
Synthetica: How about: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Lexeme }` with `data Lexeme = Comment Comment | Token Token | Spacing Spacing`
This looks pretty good to me
That might work, yeah
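A rough sketch of what such a first pass could look like, under several assumptions: `String` stands in for `Text`, spans are plain character offsets, tokens are just runs of non-space characters, and only `#` line comments are recognised (no `/* */` comments, no string literals). The field names `segSpan` and `segLexeme` are used instead of `span` and `lexeme`, because a field called `span` becomes ambiguous with the Prelude's `span` as soon as either is used unqualified. `lexSegments`, `lexemeText` and `render` are hypothetical names:

```haskell
import Data.Char (isSpace)

-- Half-open range of character offsets into the original source.
data SourceSpan = SourceSpan { spanStart :: Int, spanEnd :: Int }
  deriving (Show, Eq)

-- Simplified: each lexeme just stores its own source text.
data Lexeme
  = Comment String
  | Token String
  | Spacing String
  deriving (Show, Eq)

data Segment = Segment { segSpan :: SourceSpan, segLexeme :: Lexeme }
  deriving (Show, Eq)

-- Split the input into spacing, '#' line comments and everything else.
-- Every branch consumes at least one character, so this terminates.
lexSegments :: String -> [Segment]
lexSegments = go 0
  where
    go _ [] = []
    go off input@(c : _)
      | isSpace c = emit Spacing (span isSpace input)
      | c == '#'  = emit Comment (break (== '\n') input)
      | otherwise = emit Token (span isTokenChar input)
      where
        isTokenChar ch = not (isSpace ch) && ch /= '#'
        emit mk (text, rest) =
          let end = off + length text
          in Segment (SourceSpan off end) (mk text) : go end rest

lexemeText :: Lexeme -> String
lexemeText (Comment s) = s
lexemeText (Token s)   = s
lexemeText (Spacing s) = s

-- Because spacing is kept as its own lexeme, rendering is lossless.
render :: [Segment] -> String
render = concatMap (lexemeText . segLexeme)
```

With this, `render (lexSegments src) == src` should hold for any input, tabs and newlines included, which is the property a tool that rewrites only part of a file would rely on.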
Maybe some mechanism to allow for synthetic code, like wrapping `Lexeme` in a `Maybe`?
Synthetica code? :P No, what's synthetic code and what's the point of it?
Program-generated output, for example when replacing part of an expression you don't want to have to specify the spacing before-hand, you just want to say "replace this AST with this other AST"
I think this would probably come at a later stage, once the actual AST has been parsed from the tokens. Let's actually make the second parser be :: [(SourceSpan, Token)] -> AST
Hmm.. I think I see what you mean now
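A small sketch of the two-stage split, assuming the `Segment`/`Lexeme` shape from earlier in the discussion. Everything else is hypothetical: the single-constructor `Token`, the toy `AST` (variables and left-associative application only), and the names `significant`, `astSpan` and `parseAST`. It is only meant to show the second stage consuming `[(SourceSpan, Token)]` while comments and spacing stay behind in the segment list:

```haskell
data SourceSpan = SourceSpan { spanStart :: Int, spanEnd :: Int }
  deriving (Show, Eq)

-- Hypothetical token type with a single case, for illustration.
newtype Token = Ident String
  deriving (Show, Eq)

data Lexeme = Comment String | Token Token | Spacing String
  deriving (Show, Eq)

data Segment = Segment { segSpan :: SourceSpan, segLexeme :: Lexeme }
  deriving (Show, Eq)

-- Drop comments and spacing, keeping each token with its span.
significant :: [Segment] -> [(SourceSpan, Token)]
significant segs = [(sp, t) | Segment sp (Token t) <- segs]

-- Toy AST; every node remembers which part of the source it covers.
data AST
  = Var SourceSpan String
  | App SourceSpan AST AST
  deriving (Show, Eq)

astSpan :: AST -> SourceSpan
astSpan (Var sp _)   = sp
astSpan (App sp _ _) = sp

-- Second stage: identifiers applied left to right, e.g. "f x y".
parseAST :: [(SourceSpan, Token)] -> Maybe AST
parseAST [] = Nothing
parseAST (t : ts) = Just (foldl app (var t) (map var ts))
  where
    var (sp, Ident name) = Var sp name
    app f x =
      App (SourceSpan (spanStart (astSpan f)) (spanEnd (astSpan x))) f x
```

Since the AST nodes keep their spans, a later rewriting step could look up the untouched comments and spacing in the original segment list when it replaces a subtree.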
infinisil: Going to bed btw, won't be too connected for the next few days. If you want to reach me, email is in maintainers.nix
Np, I'll post to the issue if I get something nice