<Synthetica>
Oh, wait, that's just lexing operations
<infinisil>
Hmm
<Synthetica>
But you could probably parse over a sequence of strings instead of a sequence of chars
pie__ has joined #nix-lang
<Synthetica>
Might also help with the performance?
<infinisil>
Maybe? not sure
* infinisil
looks a bit closer at the hnix parser
<Synthetica>
I think what you'd end up with is that you'd parse over [(Maybe Token, String)]
pie_ has quit [Ping timeout: 252 seconds]
<Synthetica>
With all non-semantic elements being represented with (Nothing, "# Foobar"), for example
<infinisil>
Synthetica: Either Token String?
<Synthetica>
Possibly? But I'd say that if you take all second elements and cat them end-to-end, you should get your original program, and Either would prevent that
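The round-trip property described here can be sketched as follows. This is only an illustration: the `Token` constructors, the `Piece` alias, and the `render`/`tokens` helpers are hypothetical names, not taken from hnix or any existing lexer.

```haskell
-- Hypothetical token type; a real Nix lexer would have many more constructors.
data Token = TIdent String | TEquals | TSemicolon
  deriving (Show, Eq)

-- One piece of source text, tagged with a token when it is semantically
-- meaningful and with Nothing when it is a comment or whitespace.
type Piece = (Maybe Token, String)

-- Concatenating the raw text of every piece should reproduce the input.
render :: [Piece] -> String
render = concatMap snd

-- Keep only the semantically meaningful tokens for the parser proper.
tokens :: [Piece] -> [Token]
tokens ps = [t | (Just t, _) <- ps]

-- Hand-written pieces for the input: x = y; # Foobar
example :: [Piece]
example =
  [ (Just (TIdent "x"), "x")
  , (Nothing, " ")
  , (Just TEquals, "=")
  , (Nothing, " ")
  , (Just (TIdent "y"), "y")
  , (Just TSemicolon, ";")
  , (Nothing, " ")
  , (Nothing, "# Foobar")
  ]
```

With `Either Token String`, the token pieces would no longer carry their own text, so `render` could not be written this way.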
<infinisil>
Hmm I see
<infinisil>
Synthetica: You mean that it should then be `type Parser = Parsec Void [(Maybe Token, String)]`?
<Synthetica>
Probably, maybe with something else instead of List
<infinisil>
Yeah, slight problem is that megaparsec doesn't easily work with non-standard streams
<infinisil>
Synthetica: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Either Comment Token }`
<infinisil>
(Maybe with something other than a list)
<infinisil>
Then the original can be recovered with the SourceSpans and the original Text
<Synthetica>
But you'd also want to store the original whitespace right?
<infinisil>
Yeah no, with the SourceSpans and the Comments/Tokens (to pretty-print it)
<infinisil>
Synthetica: Hmm, you mean whether it was tab/space/newline and how many?
<infinisil>
I know we said 100% reproduce the original text, but is there any point to keeping the spacing exactly the same?
<infinisil>
I guess if people have certain editor preferences and use tabs to align stuff..
<Synthetica>
Probably, for example in nix-linter I'd like to rewrite part of a document without rewriting the whole thing
<infinisil>
Hmm yeah, okay spacing information needs to be kept as well
<infinisil>
Synthetica: How about: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Lexeme }` with `data Lexeme = Comment Comment | Token Token | Spacing Spacing`
<infinisil>
This looks pretty good to me
<Synthetica>
That might work, yeah
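A minimal sketch of the `Segment`/`Lexeme` design proposed above, under several assumptions: it works on `String` instead of `Text` to stay within `base`, it represents `SourceSpan` as a pair of character offsets, it uses plain `String` payloads as stand-ins for the `Comment`, `Token`, and `Spacing` types, and it names the record field `segSpan` because `span` would clash with the Prelude function of the same name. The lexing rules are toy rules, not Nix's actual lexical grammar.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- Half-open range of character offsets into the input (assumed representation).
data SourceSpan = SourceSpan { spanStart :: Int, spanEnd :: Int }
  deriving (Show, Eq)

data Lexeme
  = Comment String  -- a line comment, including the leading '#'
  | Token String    -- stand-in for a real token type
  | Spacing String  -- the exact whitespace, tabs and newlines included
  deriving (Show, Eq)

data Segment = Segment { segSpan :: SourceSpan, lexeme :: Lexeme }
  deriving (Show, Eq)

-- First stage: split the input into segments that cover every character.
lexSegments :: String -> [Segment]
lexSegments = go 0
  where
    go _ [] = []
    go off input@(c : _)
      | isSpace c = emit Spacing (span isSpace input)
      | c == '#' = emit Comment (break (== '\n') input)
      | isAlphaNum c = emit Token (span isAlphaNum input)
      | otherwise = emit Token (splitAt 1 input)
      where
        -- Every branch above yields a non-empty chunk, so the lexer terminates.
        emit mk (chunk, rest) =
          let end = off + length chunk
           in Segment (SourceSpan off end) (mk chunk) : go end rest

lexemeText :: Lexeme -> String
lexemeText (Comment s) = s
lexemeText (Token s) = s
lexemeText (Spacing s) = s

-- Because spacing is kept verbatim, the original text can be rebuilt exactly.
renderSegments :: [Segment] -> String
renderSegments = concatMap (lexemeText . lexeme)
```

In this variant the lexemes carry their own text, so the spans are only needed for error locations; storing spans alone and slicing the original input would work as well.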
<Synthetica>
Maybe some mechanism to allow for synthetic code, like wrapping `Lexeme` in a `Maybe`?
<infinisil>
Synthetica code? :P No, but what's synthetic code and what's the point of it?
<Synthetica>
Program-generated output. For example, when replacing part of an expression you don't want to have to specify the spacing beforehand, you just want to say "replace this AST with this other AST"
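One possible reading of the synthetic-code idea, sketched with an explicit `Origin` type in place of the `Maybe` wrapper mentioned above. All names here (`Origin`, `Piece`, `replaceSpan`) are hypothetical, and replacement is done on a flat list of pieces instead of an AST to keep the example short.

```haskell
-- Where a piece of text came from.
data Origin
  = FromSource Int Int  -- start and end offsets in the original input
  | Synthetic           -- produced by a tool, so it has no original position
  deriving (Show, Eq)

data Piece = Piece { origin :: Origin, text :: String }
  deriving (Show, Eq)

render :: [Piece] -> String
render = concatMap text

-- Replace the piece whose source span is exactly (start, end) with
-- synthetic text, leaving every other piece, spacing included, untouched.
replaceSpan :: Int -> Int -> String -> [Piece] -> [Piece]
replaceSpan start end new = map step
  where
    step p
      | origin p == FromSource start end = Piece Synthetic new
      | otherwise = p

-- Hand-written pieces for the input "x\t= 1" (note the tab).
example :: [Piece]
example =
  [ Piece (FromSource 0 1) "x"
  , Piece (FromSource 1 2) "\t"
  , Piece (FromSource 2 3) "="
  , Piece (FromSource 3 4) " "
  , Piece (FromSource 4 5) "1"
  ]
```

Rendering `replaceSpan 4 5 "42" example` keeps the author's tab and space as they were and only swaps the literal, which is the kind of partial rewrite a linter would want.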
<infinisil>
I think this would probably come at a later stage, once the actual AST has been parsed from the tokens. Let's actually make the second parser be :: [(SourceSpan, Token)] -> AST
<infinisil>
Hmm.. I think I see what you mean now
<Synthetica>
infinisil: Going to bed btw, won't be too connected for the next few days. If you want to reach me, email is in maintainers.nix
<infinisil>
Np, I'll post to the issue if I get something nice