infinisil changed the topic of #nix-lang to: Channel for discussing Nix as a language - https://nixos.org/nix/manual/#chap-writing-nix-expressions - Logs: https://logs.nix.samueldr.com/nix-lang/
{`-`}_ has joined #nix-lang
{`-`}_ has joined #nix-lang
{`-`} has quit [Remote host closed the connection]
{`-`}_ is now known as {`-`}
pie__ has quit [Ping timeout: 244 seconds]
__monty__ has joined #nix-lang
pie__ has joined #nix-lang
pie__ has quit [Remote host closed the connection]
pie__ has joined #nix-lang
pie__ has quit [Ping timeout: 245 seconds]
pie__ has joined #nix-lang
pie__ is now known as [redacted]
[redacted] is now known as pie__
pie__ has quit [Ping timeout: 252 seconds]
pie_ has joined #nix-lang
Synthetica has joined #nix-lang
<Synthetica> re: #nixos and comments in hnix @ infinisil
<infinisil> How about doing the parsing in two stages
<infinisil> First stage for spans: [Char] -> [Span]
<infinisil> Then we can optionally filter irrelevant spans if we don't care about comments
<Synthetica> Isn't that just your classical lexing stage?
<infinisil> And then pass that to the second stage, doing [Span] -> AST (optionally with span annotations)
<infinisil> Oh yeah, but can you do such filtering with how megaparsec works?
<infinisil> I'm not entirely sure how lexing is integrated in megaparsec, I just use the combinators
<Synthetica> Oh, wait, that's just lexing operations
<infinisil> Hmm
<Synthetica> But you could probably parse over a sequence of strings instead of a sequence of chars
pie__ has joined #nix-lang
<Synthetica> Might also help with the performance?
<infinisil> Maybe? not sure
* infinisil looks a bit closer at the hnix parser
<Synthetica> I think what you'd end up with is that you'd parse over [(Maybe Token, String)]
pie_ has quit [Ping timeout: 252 seconds]
<Synthetica> With all non-semantic elements being represented with (Nothing, "# Foobar"), for example
<infinisil> Synthetica: Either Token String?
<Synthetica> Possibly? But I'd say that if you take all second elements and cat them end-to-end, you should get your original program, and Either would prevent that
<infinisil> Hmm I see
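[Editor's sketch] The round-trip idea above (every piece keeps its source text, and concatenating the second elements gives back the program) can be illustrated with a toy lexer. This is a minimal sketch, not hnix's lexer: `Token`, `Piece`, `lexPieces`, `roundTrips` and `semantic` are hypothetical names, the token set is deliberately tiny, and plain `String` is assumed instead of `Text`.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- Hypothetical toy token type; a real Nix lexer would have many more cases.
data Token = Ident | Symbol
  deriving (Eq, Show)

-- Each piece keeps its exact source text. Trivia (whitespace, comments)
-- carries Nothing, as in the (Nothing, "# Foobar") example.
type Piece = (Maybe Token, String)

lexPieces :: String -> [Piece]
lexPieces [] = []
lexPieces s@(c : cs)
  | c == '#' = let (body, rest) = break (== '\n') s
               in (Nothing, body) : lexPieces rest
  | isSpace c = let (ws, rest) = span isSpace s
                in (Nothing, ws) : lexPieces rest
  | isIdentChar c = let (name, rest) = span isIdentChar s
                    in (Just Ident, name) : lexPieces rest
  | otherwise = (Just Symbol, [c]) : lexPieces cs
  where
    isIdentChar x = isAlphaNum x || x == '_'

-- The round-trip property: concatenating the texts gives back the input.
roundTrips :: String -> Bool
roundTrips src = concatMap snd (lexPieces src) == src

-- Dropping trivia leaves only the pieces a semantic parser cares about.
semantic :: [Piece] -> [(Token, String)]
semantic ps = [(t, txt) | (Just t, txt) <- ps]
```

With `Either Token String` a token would carry no text of its own, so `roundTrips` could not be written this way; that is the trade-off being discussed.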
<infinisil> Synthetica: You mean that it should then be `type Parser = Parsec Void [(Maybe Token, String)]`?
<Synthetica> Probably, maybe with something else instead of List
<infinisil> Yea, slight problem is that megaparsec doesn't easily work with non-standard streams, including arbitrary lists
<infinisil> I did submit 2 WIP PRs to fix that but they apparently are no good :P https://github.com/mrkkrp/megaparsec/pull/336
<{^_^}> mrkkrp/megaparsec#336 (by Infinisil, 2 weeks ago, closed): [WIP] Mono traversable
<Synthetica> Oh, bummer :(
<infinisil> Eh, that's just a small problem
<infinisil> Okay how about this
<infinisil> Wait no, I have to think some more
<infinisil> Synthetica: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Either Comment Token }`
<infinisil> (Maybe with something other than a list)
<infinisil> Then the original can be recovered with the SourceSpans and the original Text
<Synthetica> But you'd also want to store the original whitespace right?
<infinisil> Yeah no, with the SourceSpans and the Comments/Tokens (to prettyprint it)
<infinisil> Synthetica: Hmm, you mean whether it was tab/space/newline and how many?
<infinisil> I know we said 100% reproduce the original text, but is there any point to keeping the spacing exactly the same?
<infinisil> I guess if people have certain editor preferences and use tabs to align stuff..
<Synthetica> Probably, for example in nix-linter I'd like to rewrite part of a document without rewriting the whole thing
<infinisil> Hmm yeah, okay spacing information needs to be kept as well
<infinisil> Synthetica: How about: first parser :: Text -> [Segment], with `data Segment = Segment { span :: SourceSpan, lexeme :: Lexeme }` with `data Lexeme = Comment Comment | Token Token | Spacing Spacing`
<infinisil> This looks pretty good to me
<Synthetica> That might work, yeah
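[Editor's sketch] The `Segment`/`Lexeme` shape proposed above, written out so it compiles. Everything not named in the chat is an assumption: `SourceSpan` is taken to be an offset plus a length, the payload types are placeholders, `String` stands in for `Text`, and the record field is called `segSpan` rather than `span` only to avoid a clash with Prelude's `span`. `segmentText`, `reconstruct` and `replaceSegment` are hypothetical helpers.

```haskell
-- Hypothetical span: a start offset and a length into the source text.
data SourceSpan = SourceSpan { spanStart :: Int, spanLen :: Int }
  deriving (Eq, Show)

-- Placeholder payloads; the chat does not specify these types.
newtype Comment = CommentText String deriving (Eq, Show)
newtype Token = TokenText String deriving (Eq, Show)
data Spacing = Spaces Int | Tabs Int | Newlines Int deriving (Eq, Show)

data Lexeme = Comment Comment | Token Token | Spacing Spacing
  deriving (Eq, Show)

data Segment = Segment { segSpan :: SourceSpan, lexeme :: Lexeme }
  deriving (Eq, Show)

-- Slice a segment's text out of the original source.
segmentText :: String -> Segment -> String
segmentText src seg = take (spanLen sp) (drop (spanStart sp) src)
  where
    sp = segSpan seg

-- If the segments are contiguous and cover the whole input,
-- this returns the source unchanged.
reconstruct :: String -> [Segment] -> String
reconstruct src = concatMap (segmentText src)

-- Replace the text of one segment and leave everything else untouched.
replaceSegment :: String -> Segment -> String -> String
replaceSegment src seg new = before ++ new ++ after
  where
    sp = segSpan seg
    (before, rest) = splitAt (spanStart sp) src
    after = drop (spanLen sp) rest
```

`replaceSegment` is the kind of operation the nix-linter use case asks for: one part of the document changes while the surrounding spacing and comments stay as written.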
<Synthetica> Maybe some mechanism to allow for synthetic code, like wrapping `Lexeme` in a `Maybe`?
<infinisil> Synthetica code? :P No, what's synthetic code and what's the point of it?
<Synthetica> Program-generated output. For example, when replacing part of an expression you don't want to have to specify the spacing beforehand, you just want to say "replace this AST with this other AST"
<infinisil> I think this would probably come at a later stage, once the actual AST has been parsed from the tokens. Let's actually make the second parser be :: [(SourceSpan, Token)] -> AST
<infinisil> Hmm.. I think I see what you mean now
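[Editor's sketch] A minimal illustration of the second stage `[(SourceSpan, Token)] -> AST`, with the filtering step in between. The grammar is a toy (identifiers applied left to right), not Nix, and `tokensOnly`, `parseTokens`, `cover`, `astSpan` and the offset-plus-length `SourceSpan` are all assumptions made for the example.

```haskell
data SourceSpan = SourceSpan { spanStart :: Int, spanLen :: Int }
  deriving (Eq, Show)

-- Toy token: identifiers only.
newtype Token = Ident String deriving (Eq, Show)

data Lexeme = Comment String | Token Token | Spacing String
  deriving (Eq, Show)

data Segment = Segment { segSpan :: SourceSpan, lexeme :: Lexeme }
  deriving (Eq, Show)

-- Toy AST: variables and left-nested application, each node tagged with a span.
data AST
  = Var SourceSpan String
  | App SourceSpan AST AST
  deriving (Eq, Show)

-- Between the stages: drop comments and spacing, keep tokens with their spans.
tokensOnly :: [Segment] -> [(SourceSpan, Token)]
tokensOnly segs = [(segSpan s, t) | s <- segs, Token t <- [lexeme s]]

astSpan :: AST -> SourceSpan
astSpan (Var sp _) = sp
astSpan (App sp _ _) = sp

-- Smallest span covering both arguments (the first is assumed to start earlier).
cover :: SourceSpan -> SourceSpan -> SourceSpan
cover a b = SourceSpan (spanStart a) (spanStart b + spanLen b - spanStart a)

-- Second stage: [(SourceSpan, Token)] -> AST. Returns Nothing on empty input.
parseTokens :: [(SourceSpan, Token)] -> Maybe AST
parseTokens [] = Nothing
parseTokens ((sp, Ident x) : rest) = Just (foldl apply (Var sp x) rest)
  where
    apply f (sp', Ident y) = App (cover (astSpan f) sp') f (Var sp' y)
```

Because an `App` span covers everything between its children, the spacing and comments that were filtered out can still be looked up in the original text from the node's span.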
<Synthetica> infinisil: Going to bed btw, won't be too connected for the next few days. If you want to reach me, email is in maintainers.nix
<infinisil> Np, I'll post to the issue if I get something nice
__monty__ has quit [Quit: leaving]