<BarrensZeppelin>
Status update for the PEP 701 f-strings project: I must admit that the learning curve was quite steep, but the changes to the tokenizer are coming along nicely. The only missing pieces now are format specifiers and, possibly, error-message parity with CPython.
<BarrensZeppelin>
I've created lots of f-string tokenizer tests, mostly by piping a string to `uvx python@3.12 -m tokenize -e` and translating the output into a structured format.
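A minimal sketch of that workflow, assuming a `python3.12` executable on PATH rather than going through uvx; the regex is aimed at the `row,col-row,col: NAME 'text'` lines that `python -m tokenize -e` prints:

```python
# Sketch: run a source string through CPython 3.12's tokenize CLI and parse
# the output into (name, string, start, end) tuples for use as expected data.
import ast
import re
import subprocess

# Matches lines like: 1,0-1,2:            FSTRING_START  "f'"
LINE_RE = re.compile(r"(\d+),(\d+)-(\d+),(\d+):\s+(\w+)\s+(.+)")

def reference_tokens(source: str):
    proc = subprocess.run(
        ["python3.12", "-m", "tokenize", "-e"],  # assumes python3.12 on PATH
        input=source,
        capture_output=True,
        text=True,
        check=True,
    )
    tokens = []
    for line in proc.stdout.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        srow, scol, erow, ecol, name, text = m.groups()
        tokens.append((name, ast.literal_eval(text),
                       (int(srow), int(scol)), (int(erow), int(ecol))))
    return tokens

if __name__ == "__main__":
    for tok in reference_tokens("f'answer: {value:>8}'\n"):
        print(tok)
```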
<BarrensZeppelin>
I'm wondering if it would make sense to automate this (possibly with a code generation step, to avoid introducing a dependency on uvx).
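The code-generation step could be as simple as dumping those reference tuples into a checked-in module, so the test suite itself needs neither uvx nor a CPython 3.12 install. Something along these lines, where the file and variable names are made up and `reference_tokens` is the helper from the sketch above:

```python
# Sketch of generating checked-in expected-token data from the reference
# tokenizer, so running the tests does not require CPython 3.12 or uvx.
# Assumes the previous sketch is saved as reference_tokenizer.py.
from reference_tokenizer import reference_tokens

CASES = [
    "f'plain'\n",
    "f'{x}'\n",
    "f'{x!r}'\n",
    "f'answer: {value:>{width}}'\n",
]

def generate(path="fstring_token_data.py"):
    with open(path, "w") as out:
        out.write("# Generated file; regenerate with the codegen script.\n")
        out.write("EXPECTED = {\n")
        for src in CASES:
            out.write(f"    {src!r}: {reference_tokens(src)!r},\n")
        out.write("}\n")

if __name__ == "__main__":
    generate()
```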
<BarrensZeppelin>
It would also make it easy to run fuzz tests with hypothesis.
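Roughly what such a hypothesis fuzz test could look like, written as a differential test against the reference output; `pypy_tokens` stands in for whatever hook ends up exposing the tokenizer under test, and `reference_tokens` is the helper from the first sketch:

```python
# Sketch of a differential fuzz test: generate small f-string sources and
# check the tokenizer under test against CPython 3.12's reference output.
from hypothesis import given, settings
from hypothesis import strategies as st

from reference_tokenizer import reference_tokens    # helper from the earlier sketch
from tokenizer_under_test import pypy_tokens        # hypothetical hook into the new tokenizer

# Keep the generated pieces simple so every input is valid Python.
names = st.from_regex(r"[a-z]{1,5}", fullmatch=True)
specs = st.sampled_from(["", "!r", ":>10", ":.2f", "!s:^{w}"])
literals = st.sampled_from(["", "pre ", " post", "{{escaped}} "])

@given(prefix=literals, name=names, spec=specs, suffix=literals)
@settings(max_examples=200, deadline=None)
def test_matches_cpython(prefix, name, spec, suffix):
    source = f"f'{prefix}{{{name}{spec}}}{suffix}'\n"
    assert pypy_tokens(source) == reference_tokens(source)
```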
BarrensZeppelin has quit [Quit: Client closed]
<LarstiQ>
what's the need for the automation after you've created the tests? Doesn't CPython 3.12 have tests that can be imported?
<LarstiQ>
great to hear about the progress!
BarrensZeppelin has joined #pypy
<BarrensZeppelin>
Right, good question! In PyPy (for 3.11) the internal RPython tokenizer is not exposed to application code at all (afaik). The Python 3.11 standard library includes both a pure Python tokenizer and a C extension tokenizer, and the standard library tokenizer tests exercise both implementations.
<BarrensZeppelin>
In Python 3.12 the pure Python tokenizer was removed, and the C extension tokenizer does not exist in PyPy, so I guess it would be necessary to actually expose the internal RPython tokenizer somehow.
<BarrensZeppelin>
Once the RPython tokenizer is exposed through the `tokenize` module, the standard library tests will be useful for testing it. I think I'd need help with how best to tackle that, though.
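For reference, this is roughly what the application-level `tokenize` module reports for an f-string on CPython 3.12, i.e. what a tokenizer exposed at that level would have to reproduce for the stdlib tests to pass:

```python
# On CPython 3.12, the tokenize module emits the PEP 701 token kinds
# (FSTRING_START / FSTRING_MIDDLE / FSTRING_END) for f-string literals.
import io
import tokenize

src = "f'answer: {value:>8}'\n"
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))

# Expected output (roughly): FSTRING_START "f'", FSTRING_MIDDLE 'answer: ',
# OP '{', NAME 'value', OP ':', FSTRING_MIDDLE '>8', OP '}', FSTRING_END "'",
# then NEWLINE and ENDMARKER.
```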
BarrensZeppelin has quit [Quit: Client closed]
<nimaje>
why not keep the pure Python tokenizer?
BarrensZeppelin has joined #pypy
<BarrensZeppelin>
I assume the CPython developers did not want to maintain two separate tokenizer implementations, especially with the changes needed for Python 3.12. If you're proposing that PyPy keep the pure Python tokenizer, I guess you'd have the same problem.