itamarst has quit [Quit: Connection closed for inactivity]
_whitelogger has joined #pypy
whitequark_ has joined #pypy
<whitequark_>
krono: (both URLs will work indefinitely, the old one just has a 301 Moved Permanently now)
<whitequark_>
folks, i'm seeing PyPy have really bad (worse than CPython, though not by much) performance on what appears on the surface to be a very simple transcoding function
<whitequark_>
does anyone have ideas as to why it might be slow? it operates on one memoryview and one bytearray, doesn't (or at least, shouldn't on the face of it) allocate besides expanding that bytearray, only touches integers otherwise
<whitequark_>
alternately, any pointers to how to find out what makes it slow would be just as appreciated
<krono>
Good point
<whitequark_>
i guess one benefit is that the new URL is shorter and easier to pronounce :)
<whitequark_>
(i'm migrating away the public benefit infrastructure that i run to not share machines with my private infrastructure, partly because i'm concerned one could be used to attack the other, partly because i would like the public benefit infrastructure i run to have a bus factor of more than 1)
<whitequark_>
oh, pypy's terminal is very pretty during the build
<krono>
it is, isn't it?
<krono>
If you want to go down a rabbit hole, the PYPYLOG env var gives you _a lot_ of info about what is going on.
<krono>
For example, if you want to know what the jit is doing, you can do ` PYPYLOG=jit-log-opt:jit-summary:out.pypylog ITERS=100 PURE_PYTHON=1 pypy3 benchmark.py`
<krono>
and look at out.pypylog (it has a… peculiar format tho)
<krono>
(oh, it's `PYPYLOG=jit-log-opt,jit-summary:out.pypylog`, a typo, sorry)
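(editor's sketch, not from the channel: the corrected value has the shape `category,category:logfile`. The helper names `pypylog_value` and `pypylog_env` below are hypothetical, only there to show how such a value could be assembled and handed to a child process. It assumes nothing about PyPy beyond the format krono gives above.)

```python
import os


def pypylog_value(categories, logfile):
    """Build a PYPYLOG value: comma-separated categories, a colon, then the log file."""
    return ",".join(categories) + ":" + logfile


def pypylog_env(categories, logfile, base_env=None):
    """Return a copy of the environment with PYPYLOG set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYPYLOG"] = pypylog_value(categories, logfile)
    return env


# The same value as in the corrected command line above.
value = pypylog_value(["jit-log-opt", "jit-summary"], "out.pypylog")
print("PYPYLOG=" + value)

# To actually run the benchmark with it (assumes pypy3 is on PATH):
#   import subprocess
#   subprocess.run(["pypy3", "benchmark.py"],
#                  env=pypylog_env(["jit-log-opt", "jit-summary"], "out.pypylog"))
```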
<krono>
Here's an example pypylog for a slightly amended benchmark py: https://bpa.st/ZU2A
<krono>
doesnt look toooo bad from first view tho…
<whitequark_>
why is it allocating two objects on each iteration?
<whitequark_>
apparently, removing `memoryview.cast()` and using just `memoryview()` improves performance by a factor of 10 (!)
<whitequark_>
actually it's more like 20
<whitequark_>
13 MB/s to 236 MB/s
<LarstiQ>
oh that's a nice one
<whitequark_>
someone should probably fix `.cast()` so that a no-op cast doesn't cause this slowdown at least
<whitequark_>
(even if i call `.cast('B')` on a memoryview with format `'B'` it still does the thing where it allocates twice per iteration)
<whitequark_>
casting a memoryview is so slow that ditching memoryview manipulation altogether and using `bytes` (with copying on every iteration that commits a range) is still faster
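(editor's sketch, not from the channel: a minimal way to compare the three variants discussed above. whitequark_'s transcoding function is not shown in the log, so `checksum` is a hypothetical stand-in that just indexes the buffer one integer at a time. The numbers it prints depend on the interpreter and version, and it may or may not reproduce the slowdown reported here.)

```python
import time


def checksum(view):
    """Stand-in workload: read every byte by index and fold it into an integer."""
    total = 0
    for i in range(len(view)):
        total = (total + view[i]) & 0xFFFFFFFF
    return total


def throughput_mb_s(view, repeats=3):
    """Best-of-N throughput of checksum() over the buffer, in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        checksum(view)
        best = min(best, time.perf_counter() - start)
    return len(view) / best / 1e6 if best > 0 else float("inf")


data = bytearray(range(256)) * 1024      # 256 KiB of test data
plain = memoryview(data)                 # no cast
cast = memoryview(data).cast("B")        # no-op cast: format is already 'B'
as_bytes = bytes(data)                   # copy, no memoryview at all

for name, buf in [("memoryview", plain), ("memoryview.cast('B')", cast), ("bytes", as_bytes)]:
    print(f"{name:>22}: {throughput_mb_s(buf):8.1f} MB/s")
```

Run it under both `python3` and `pypy3` to compare; combining it with PYPYLOG as krono describes should show whether the loop over the cast view allocates per iteration.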