ChanServ changed the topic of #yosys to: Yosys Open SYnthesis Suite: https://github.com/YosysHQ/yosys/ | Channel logs: https://libera.catirclogs.org/yosys/ | Bridged to #yosys:matrix.org
tpb has quit [Remote host closed the connection]
tpb has joined #yosys
_whitelogger has joined #yosys
mithro has quit [Ping timeout: 272 seconds]
mithro has joined #yosys
krispaul has joined #yosys
kristianpaul has quit [Ping timeout: 240 seconds]
_whitelogger has joined #yosys
xutaxkamay_ has joined #yosys
xutaxkamay has quit [Read error: Connection reset by peer]
xutaxkamay_ is now known as xutaxkamay
krispaul has quit [Quit: WeeChat 3.5]
kristianpaul has joined #yosys
nonchip has quit [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
nonchip has joined #yosys
<ysionneau> hi, I'm trying to understand a timing report from nextpnr here: https://paste.centos.org/view/44380199 specifically the section "Info: Critical path report for cross-domain path 'posedge $glbnet$ecp5pll_clkout0' -> '<async>':"
<tpb> Title: Untitled - Pastebin Service (at paste.centos.org)
<ysionneau> I can see the path starts with: clk-to-q 4.26 4.26 Source storage_3.0.0.DOB4
<ysionneau> storage_3.0.0.DOB4 is a DP16KD (dual ported ram) of ECP5
<ysionneau> well, that's storage_3.0.0 and I guess DOB4 is the DataOut port B bit 4
<ysionneau> clk-to-q is the time for the DP16KD to output data after a clock edge? is it normal it's so slow?!
<ysionneau> it's eating so much of my timing
<ysionneau> I'm trying to find this information in Lattice datasheet but all I find is wave diagram and this timing annotation : tCOO_EBR
<ysionneau> but they never say how much tCOO_EBR is
lofty[m] has joined #yosys
<lofty[m]> ysionneau: what's your nextpnr command line?
<lofty[m]> I should point out that "<async>" usually implies "the outside world", through I/O or such.
<ysionneau> yes the end of this route is an IO
<ysionneau> but the remaining part of this (critical) path seems quite normal
<ysionneau> I'm just worried about this 4.26 ns, which is enormous
<ysionneau> hmm maybe it's because it's in "NOREG" mode (output is not registered), I get "REGMODE_A": "NOREG", and same for B
<ysionneau> it's sad because I instantiated a LiteX SyncFIFO with buffered=True, which I think is meant to mean "please put a reg at output"
<ysionneau> but it seems yosys did not understand something and put a NOREG DP16KD
<ysionneau> hmhm it seems indeed the verilog does not seem *that* buffered, at all :o
<ysionneau> so I guess it's a LiteX or migen issue
<ysionneau> lofty[m]: cmdline is nextpnr-ecp5 --json hydrasucrela.json --lpf hydrasucrela.lpf --textcfg hydrasucrela.config --12k --package CABGA381 --speed 8 --timing-allow-fail --seed 1
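For a sense of scale, the 4.26 ns clk-to-q figure can be compared against the clock periods implied by the 109 MHz and 121 MHz results reported further down in this log. This is only a rough sketch: it assumes those two frequencies are the Fmax of the same clock domain that drives the DP16KD, which the log does not state explicitly, and the helper name `period_ns` is made up for illustration.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


# Figures quoted in the log; their relation to each other is an assumption.
CLK_TO_Q_NS = 4.26       # clk-to-q of storage_3.0.0.DOB4 from the timing report
FMAX_BEFORE_MHZ = 109.0  # reported before the extra register stage
FMAX_AFTER_MHZ = 121.0   # reported after the extra register stage

before = period_ns(FMAX_BEFORE_MHZ)
after = period_ns(FMAX_AFTER_MHZ)

print(f"period before: {before:.2f} ns, after: {after:.2f} ns")
print(f"period saved:  {before - after:.2f} ns")
print(f"clk-to-q share of the original period: {CLK_TO_Q_NS / before:.0%}")
```

Under that assumption, the BRAM clk-to-q alone takes up a large part of one cycle, which is why an extra register directly after the memory output helps.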
whitequark[cis] has joined #yosys
<whitequark[cis]> SyncFIFOBuffered puts one register at the output of an async memory
<whitequark[cis]> if you want one more pipeline stage you have to add it yourself
<whitequark[cis]> AsyncFIFOBuffered however adds a pipeline stage on top of an existing sync memory
<ysionneau> hmhm
<ysionneau> whitequark[cis]: I can see this in the generated verilog https://paste.centos.org/view/7209e8b4
<tpb> Title: Untitled - Pastebin Service (at paste.centos.org)
<ysionneau> iiuc the fact that it's read in an always @posedge XXX block means the output is buffered?
<whitequark[cis]> that looks to me like a normal sync memory macro, with one cycle latency
<ysionneau> shouldn't yosys then generate an OUTREG DP16KD?
<ysionneau> or I'm missing something
<whitequark[cis]> a BRAM on an FPGA always has registered output
<whitequark[cis]> i'm not deeply familiar with ECP5 but i'm fairly sure that "NOREG" means 1 cycle of latency and "OUTREG" means 2 cycles
<ysionneau> oh
<whitequark[cis]> DP16KD will never be fully async, no FPGA BRAM macro is
<ysionneau> ok so maybe my understanding issue is what is an asyncfifo vs a syncfifo
<whitequark[cis]> SyncFIFO has reader and writer in the same clock domain, AsyncFIFO has them in potentially different domains
<ysionneau> ok
<whitequark[cis]> non-buffered SyncFIFO in Migen uses async memory primitives (LUTRAM or FFRAM)
<whitequark[cis]> distributed RAM in Xilinx lingo
philtor has quit [Remote host closed the connection]
<ysionneau> which can be quite slow I guess when it's big
<whitequark[cis]> everything else uses BRAM
<whitequark[cis]> quite, yes
<ysionneau> so in my case read and write ports are in the same clk domain
<ysionneau> so I guess I am OK to use a SyncFIFO
<whitequark[cis]> you want SyncFIFOBuffered
<ysionneau> I've put buffered=True
<whitequark[cis]> there is almost never any reason to use the non buffered version
<whitequark[cis]> (in fact it has an incompatible interface...)
<ysionneau> maybe I'm reading this wrong but isn't SyncFIFOBuffered the same as SyncFIFO(buffered=True)?
<whitequark[cis]> it is iirc (been many years since i touched migen but i don't think it changed much)
<ysionneau> so you think I need to add yet another level of buffering after?
<ysionneau> so that yosys puts an OUTREG DP16KD
<whitequark[cis]> adding a pipeline stage would be a solution to slow clk-to-q yeah
<whitequark[cis]> whether it will be folded into DP16KD im not sure
<whitequark[cis]> but it should improve timing either way
<ysionneau> sure it would cut my critical path
<ysionneau> is there an easy way to put a "buffer" for all signals in between two pipeline elements in the stream.* api?
<ysionneau> source/sink api I mean
<whitequark[cis]> thats a litex thing, i never learned litex
<whitequark[cis]> but it is like 3 lines of code
<ysionneau> because I can't just do sync += [self.source.data.eq(fifo.source.data)] I should also buffer .valid
<ysionneau> well you're right if it's just data and valid ... it's 2 lines
<whitequark[cis]> if(~valid | ready) dest.data.eq(src.data), dest.valid.eq(src.valid) else dest.valid.eq(0)
<whitequark[cis]> all sync
<ysionneau> I've moved self.packet_buffer_pipeline.source.connect(self.source) from a comb to a sync block, I think it's ok
<ysionneau> let's try
<whitequark[cis]> no that will be totally broken
<whitequark[cis]> migen Record.connect is fucked by design, you should never put it in sync
<ysionneau> arg ok
<whitequark[cis]> we redesigned it in amaranth to not be a complete disaster
<whitequark[cis]> and a part of that is making it impossible to put it in sync
<ysionneau> I didn't know about the Record.connect issue
<ysionneau> good to know!
<whitequark[cis]> also if you reverse source/sink by accident it will. silently do nothing
<whitequark[cis]> and you'll spend an hour looking at verilog trying to figure out wtf has happened
<ysionneau> I had an issue where I put A.connect(B) instead of B.connect(A) just today and it didn't work well ^^
<ysionneau> so I've put this https://paste.centos.org/view/241a11da
<tpb> Title: Untitled - Pastebin Service (at paste.centos.org)
<ysionneau> now the critical path is gone
<ysionneau> but I don't know if it's because it's really gone or if I fucked up and all my path is optimized out =)
<ysionneau> hmhm it seems I fucked up, I only see 0s, everything has been optimized out
<whitequark[cis]> that is not correct
<whitequark[cis]> please look at the pseudocode i gave you
<whitequark[cis]> the reason it's all optimized out is because you connected ready backwards
<ysionneau> ouch
<ysionneau> yeah now it works :)
<ysionneau> thanks a lot!
<whitequark[cis]> you do need a conditional there
<whitequark[cis]> if you haven't put one there yet
<ysionneau> went from 109 MHz to 121 MHz now!
<ysionneau> ah
<whitequark[cis]> you will have data loss
<ysionneau> ok I think I understand now
<whitequark[cis]> i recommend reading https://amaranth-lang.org/docs/amaranth/latest/stdlib/stream.html which actually explains the stream invariants
<tpb> Title: Data streams — Amaranth language & toolchain 0.6.0.dev98 documentation (at amaranth-lang.org)
<ysionneau> thanks for the link
<whitequark[cis]> it's a little different from litex in syntax but it is the same concept
<ysionneau> whitequark[cis]: in "if(~valid | ready)", which valid and ready are those?
<ysionneau> I must say I'm not connecting a source to a sink but a source to a source right now (module encapsulation)
<ysionneau> I'm connecting the fifo.source from my module to the source of the module itself
<ysionneau> (to make it clear)
<whitequark[cis]> src.ready.eq(dest.ready); if(~dest.valid | dest.ready) dest.data.eq(src.data), dest.valid.eq(src.valid) else dest.valid.eq(0)
<whitequark[cis]> it's more or less the same for source to source if you add a pipeline stage
<whitequark[cis]> imagine "inverting" the module ports
<whitequark[cis]> like, inverting the direction. now it becomes a sink
<ysionneau> I see
<whitequark[cis]> here, dest.data becomes a sort of one element FIFO, with dest.valid being its level
<whitequark[cis]> it's a useful pattern to remember for performance issues
<ysionneau> hmm now something looks broken
<ysionneau> I've put this https://paste.centos.org/view/5c1c14dc
<tpb> Title: Untitled - Pastebin Service (at paste.centos.org)
<ysionneau> now it's like nothing goes down the pipeline anymore, not even 0s
<whitequark[cis]> last line with ready assignment must be in comb
<whitequark[cis]> rest of it looks good
<ysionneau> hmm still no luck
<ysionneau> let's simulate this
<whitequark[cis]> oh hold on
<whitequark[cis]> remove the else branch?
<ysionneau> works :)
<ysionneau> thanks a lot, you're the fpga master o/
<ysionneau> yosys still instantiates a NOREG DP16KD, but anyway, since there is a FF behind it, my critical path is still shorter
<ysionneau> but the ram could be even faster with OUTREG ^^
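The register stage discussed above (dest.data acting as a one-element FIFO with dest.valid as its level, and no else branch) can be modelled cycle by cycle in plain Python to check that nothing is lost or duplicated under backpressure. This is a hedged sketch, not the code from the pastes: the class and function names are hypothetical, and it assumes the upstream ready is derived combinationally as `~dest.valid | dest.ready`, which is one common formulation and not necessarily what the final design used.

```python
import random


class RegisterStage:
    """Hypothetical model of a one-entry valid/ready pipeline register."""

    def __init__(self):
        self.valid = False  # plays the role of dest.valid (slot occupied)
        self.data = None    # plays the role of dest.data

    def src_ready(self, dest_ready: bool) -> bool:
        # Combinational (assumption of this sketch): upstream may push
        # when the slot is empty or is being drained in this cycle.
        return (not self.valid) or dest_ready

    def tick(self, src_valid: bool, src_data, dest_ready: bool) -> None:
        # Clock edge: load only under the conditional; otherwise hold.
        # There is deliberately no else branch clearing valid.
        if (not self.valid) or dest_ready:
            self.data = src_data
            self.valid = src_valid


def simulate(items, seed=0, max_cycles=10_000):
    """Push items through the stage with random downstream backpressure."""
    rng = random.Random(seed)
    stage = RegisterStage()
    pending = list(items)
    received = []
    for _ in range(max_cycles):
        if len(received) == len(items):
            break
        src_valid = bool(pending)            # source holds valid until accepted
        src_data = pending[0] if src_valid else None
        dest_ready = rng.random() < 0.5      # consumer stalls about half the time
        # Handshakes are evaluated on the values seen before the clock edge.
        if stage.valid and dest_ready:
            received.append(stage.data)
        src_fire = src_valid and stage.src_ready(dest_ready)
        stage.tick(src_valid, src_data, dest_ready)
        if src_fire:
            pending.pop(0)
    return received


if __name__ == "__main__":
    sent = list(range(20))
    print(simulate(sent) == sent)
```

In this model the register only changes when the slot is empty or the consumer accepts the current item, so a stalled consumer never causes an item to be overwritten.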