00:00
tpb has quit [Remote host closed the connection]
00:01
tpb has joined #yosys
03:06
_whitelogger has joined #yosys
05:08
_whitelogger has joined #yosys
05:34
mithro has quit [Ping timeout: 272 seconds]
05:38
mithro has joined #yosys
08:03
krispaul has joined #yosys
08:04
kristianpaul has quit [Ping timeout: 240 seconds]
10:00
_whitelogger has joined #yosys
12:55
xutaxkamay_ has joined #yosys
12:55
xutaxkamay has quit [Read error: Connection reset by peer]
12:56
xutaxkamay_ is now known as xutaxkamay
20:44
krispaul has quit [Quit: WeeChat 3.5]
20:45
kristianpaul has joined #yosys
22:13
nonchip has joined #yosys
22:39
<
ysionneau >
hi, I'm trying to understand a timing report from nextpnr here:
https://paste.centos.org/view/44380199 the Info: Critical path report for cross-domain path 'posedge $glbnet$ecp5pll_clkout0' -> '<async>':
22:39
<
tpb >
Title: Untitled - Pastebin Service (at paste.centos.org)
22:39
<
ysionneau >
I can see the path starts with: clk-to-q 4.26 4.26 Source storage_3.0.0.DOB4
22:40
<
ysionneau >
storage_3.0.0.DOB4 is a DP16KD (dual ported ram) of ECP5
22:41
<
ysionneau >
well, that's storage_3.0.0 and I guess DOB4 is the DataOut port B bit 4
22:42
<
ysionneau >
clk-to-q is the time for the DP16KD to output data after a clock edge? is it normal it's so slow?!
22:42
<
ysionneau >
it's eating so much of my timing
22:43
<
ysionneau >
I'm trying to find this information in Lattice datasheet but all I find is wave diagram and this timing annotation : tCOO_EBR
22:43
<
ysionneau >
but they never say how much tCOO_EBR is
22:48
lofty[m] has joined #yosys
22:48
<
lofty[m] >
ysionneau: what's your nextpnr command line?
22:49
<
lofty[m] >
I should point out that "<async>" usually implies "the outside world", through I/O or such.
22:53
<
ysionneau >
yes the end of this route is an IO
22:54
<
ysionneau >
but the remaining part of this (critical) path seems quite normal
22:54
<
ysionneau >
I'm just worried about this 4.26 ns, which is enormous
22:55
<
ysionneau >
hmm maybe it's because it's in "NOREG" mode (output is not registered), I get "REGMODE_A": "NOREG", and same for B
22:57
<
ysionneau >
it's sad because I instantiated a LiteX SyncFIFO with buffered=True, which I think is meant to mean "please put a reg at output"
22:58
<
ysionneau >
but it seems yosys did not understand something and put a NOREG DP16KD
23:04
<
ysionneau >
hmhm it seems indeed the verilog does not seem *that* buffered, at all :o
23:04
<
ysionneau >
so I guess it's a LiteX or migen issue
23:05
<
ysionneau >
lofty[m]: cmdline is nextpnr-ecp5 --json hydrasucrela.json --lpf hydrasucrela.lpf --textcfg hydrasucrela.config --12k --package CABGA381 --speed 8 --timing-allow-fail --seed 1
23:05
whitequark[cis] has joined #yosys
23:05
<
whitequark[cis] >
SyncFIFOBuffered puts one register at the output of an async memory
23:05
<
whitequark[cis] >
if you want one more pipeline stage you have to add it yourself
23:06
<
whitequark[cis] >
AsyncFIFOBuffered however adds a pipeline stage on top of an existing sync memory
23:10
<
tpb >
Title: Untitled - Pastebin Service (at paste.centos.org)
23:11
<
ysionneau >
iiuc the fact that it's read in an always @posedge XXX block means the output is buffered?
23:11
<
whitequark[cis] >
that looks to me like a normal sync memory macro, with one cycle latency
23:11
<
ysionneau >
shouldn't yosys then generate an OUTREG DP16KD ?
23:11
<
ysionneau >
or I'm missing something
23:12
<
whitequark[cis] >
a BRAM on an FPGA always has registered output
23:12
<
whitequark[cis] >
i'm not deeply familiar with ECP5 but i'm fairly sure that "NOREG" means 1 cycle of latency and "OUTREG" means 2 cycles
23:13
<
whitequark[cis] >
DP16KD will never be fully async, no FPGA BRAM macro is
23:14
<
ysionneau >
ok so maybe my understanding issue is what is an asyncfifo vs a syncfifo
23:15
<
whitequark[cis] >
SyncFIFO has reader and writer in the same clock domain, AsyncFIFO has them in potentially different domains
23:15
<
whitequark[cis] >
non-buffered SyncFIFO in Migen uses async memory primitives (LUTRAM or FFRAM)
23:16
<
whitequark[cis] >
distributed RAM in Xilinx lingo
23:16
philtor has quit [Remote host closed the connection]
23:16
<
ysionneau >
which can be quite slow I guess when it's big
23:16
<
whitequark[cis] >
everything else uses BRAM
23:16
<
whitequark[cis] >
quite, yes
23:16
<
ysionneau >
so in my case read and write ports are in the same clk domain
23:17
<
ysionneau >
so I guess I am OK to use a SyncFIFO
23:17
<
whitequark[cis] >
you want SyncFIFOBuffered
23:17
<
ysionneau >
I've put buffered=True
23:17
<
whitequark[cis] >
there is almost never any reason to use the non buffered version
23:17
<
whitequark[cis] >
(in fact it has an incompatible interface...)
23:18
<
ysionneau >
maybe I'm reading this wrong but isn't SyncFIFOBuffered the same as SyncFIFO(buffered=True) ?
23:19
<
whitequark[cis] >
it is iirc (been many years since i touched migen but i don't think it changed much)
23:20
<
ysionneau >
so you think I need to add yet another level of buffering after?
23:20
<
ysionneau >
so that yosys put a OUTREG DPK16KD
23:21
<
ysionneau >
DP16KD*
23:21
<
whitequark[cis] >
adding a pipeline stage would be a solution to slow clk-to-q yeah
23:21
<
whitequark[cis] >
whether it will be folded into DP16KD im not sure
23:21
<
whitequark[cis] >
but it should improve timing either way
23:21
<
ysionneau >
sure it would cut my critical path
23:23
<
ysionneau >
is there an easy way to put a "buffer" for all signals in between two pipeline elements in the stream.* api?
23:23
<
ysionneau >
source/sink api I mean
23:24
<
whitequark[cis] >
thats a litex thing, i never learned litex
23:24
<
whitequark[cis] >
but it is like 3 lines of code
23:24
<
ysionneau >
because I can't just do sync += [self.source.data.eq(fifo.source.data)] I should also buffer .valid
23:25
<
ysionneau >
well you're right if it's just data and valid ... it's 2 lines
23:26
<
whitequark[cis] >
if(~valid | ready) dest.data.eq(src.data), dest.valid.eq(src.valid) else dest.valid.eq(0)
23:27
<
whitequark[cis] >
all sync
23:28
<
ysionneau >
I've moved self.packet_buffer_pipeline.source.connect(self.source) from a comb to a sync block, I think it's ok
23:28
<
ysionneau >
let's try
23:30
<
whitequark[cis] >
no that will be totally broken
23:30
<
whitequark[cis] >
migen Record.connect is fucked by design, you should never put it in sync
23:32
<
whitequark[cis] >
we redesigned it in amaranth to not be a complete disaster
23:32
<
whitequark[cis] >
and a part of that is making it impossible to put it in sync
23:34
<
ysionneau >
I didn't know about the Record.connect issue
23:34
<
ysionneau >
good to know!
23:35
<
whitequark[cis] >
also if you reverse source/sink by accident it will. silently do nothing
23:36
<
whitequark[cis] >
and you'll spend an hour looking at verilog trying to figure out wtf has happened
23:36
<
ysionneau >
I had an issue where I put A.connect(B) instead of B.connect(A) just today and it didn't work well ^^
23:36
<
tpb >
Title: Untitled - Pastebin Service (at paste.centos.org)
23:36
<
ysionneau >
now the critical path is gone
23:37
<
ysionneau >
but I don't know if it's because it's really gone or if I fucked up and all my path is optimized out =)
23:38
<
ysionneau >
hmhm it seems I fucked up, I only see 0s, everything has been optimized out
23:38
<
whitequark[cis] >
that is not correct
23:38
<
whitequark[cis] >
please look at the pseudocode i gave you
23:39
<
whitequark[cis] >
the reason it's all optimized out is because you connected ready backwards
23:40
<
ysionneau >
yeah now it works :)
23:40
<
ysionneau >
thanks a lot!
23:41
<
whitequark[cis] >
you do need a conditional there
23:41
<
whitequark[cis] >
if you haven't put one there yey
23:41
<
ysionneau >
went from 109 MHz to 121 MHz now!
23:41
<
whitequark[cis] >
s/yey/yet/
23:41
<
whitequark[cis] >
you will have data loss
23:42
<
ysionneau >
ok I think I understand now
23:43
<
tpb >
Title: Data streams — Amaranth language & toolchain 0.6.0.dev98 documentation (at amaranth-lang.org)
23:43
<
ysionneau >
thanks for the link
23:43
<
whitequark[cis] >
it's a little different from litex in syntax but it is the same concept
23:44
<
ysionneau >
whitequark[cis] | if(~valid | ready) < which valid and ready are those?
23:45
<
ysionneau >
I must say I'm not connecting a source to a sink but a source to a source right now (module encapsulation)
23:45
<
ysionneau >
I'm connecting the fifo.source from my module to the source of the module itself
23:45
<
ysionneau >
(to make it clear)
23:45
<
whitequark[cis] >
src.ready.eq(dest.ready); if(~dest.valid | dest.ready) dest.data.eq(src.data), dest.valid.eq(src.valid) else dest.valid.eq(0)
23:46
<
whitequark[cis] >
it's more or less the same for source to source if you add a pipeline stage
23:46
<
whitequark[cis] >
imagine "inverting" the module ports
23:46
<
whitequark[cis] >
like, inverting the direction. now it becomes a sink
23:48
<
whitequark[cis] >
whitequark[cis]: here, dest.data becomes a sort of one element FIFO, with dest.valid being its level
23:48
<
whitequark[cis] >
it's a useful pattern to remember for performance issues
23:52
<
ysionneau >
hmm now something looks broken
23:53
<
tpb >
Title: Untitled - Pastebin Service (at paste.centos.org)
23:54
<
ysionneau >
now it's like nothing goes down the pipeline anymore, not even 0s
23:54
<
whitequark[cis] >
last line with ready assignment must be in comb
23:54
<
whitequark[cis] >
rest of it looks good
23:55
<
ysionneau >
hmm still no luck
23:55
<
ysionneau >
let's simulate this
23:56
<
whitequark[cis] >
oh hold on
23:56
<
whitequark[cis] >
remove the else branch?
23:57
<
ysionneau >
works :)
23:57
<
ysionneau >
thanks a lot, you're the fpga master o/
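The one-entry pipeline stage discussed above (dest.data acting as a one-element FIFO, dest.valid as its level, no else branch) can be sketched as a plain-Python cycle model. This is not Migen or LiteX code, and the names `RegisterStage`, `tick` and `run` are made up for illustration. One deliberate difference from the pseudocode in the chat: the upstream ready here is `~dest.valid | dest.ready` (the usual register-slice form) rather than `dest.ready` alone, which is an assumption of this sketch and not something the log confirms was used.

```python
from dataclasses import dataclass


@dataclass
class RegisterStage:
    """Software model of a one-entry valid/ready pipeline register."""
    valid: bool = False  # models dest.valid (the "level" of the one-entry FIFO)
    data: int = 0        # models dest.data

    def src_ready(self, dest_ready: bool) -> bool:
        # Combinational: accept a new item if the register is empty or being drained.
        return (not self.valid) or dest_ready

    def tick(self, src_valid: bool, src_data: int, dest_ready: bool) -> None:
        # Synchronous update; there is no else branch, so state holds when stalled.
        if self.src_ready(dest_ready):
            self.data = src_data
            self.valid = src_valid


def run(items, ready_pattern):
    """Push items through the stage; ready_pattern gives dest.ready per cycle."""
    stage = RegisterStage()
    pending = list(items)
    received = []
    for dest_ready in ready_pattern:
        src_valid = bool(pending)
        src_data = pending[0] if pending else 0
        # Sample both handshakes before the clock edge.
        if stage.valid and dest_ready:
            received.append(stage.data)
        accepted = src_valid and stage.src_ready(dest_ready)
        stage.tick(src_valid, src_data, dest_ready)
        if accepted:
            pending.pop(0)
    return received


if __name__ == "__main__":
    # Downstream stalls on some cycles; items should arrive once each, in order.
    print(run([1, 2, 3, 4], [False, False, True, True, False, True, True, True]))
```

Running the model with a stalling consumer is a quick way to check that nothing is lost or duplicated before trying the same structure in HDL.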
23:59
<
ysionneau >
yosys still instantiates a NOREG DP16KD but anyway, since there is a FF behind it, my critical path is still shorter
23:59
<
ysionneau >
but the ram could be even faster with OUTREG ^^
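For a rough sense of the numbers quoted in this log, the following sketch converts the two reported frequencies (109 MHz and 121 MHz) into clock periods and compares them with the 4.26 ns clk-to-q delay. It assumes, without confirmation from the log, that the 109 MHz figure was limited by a path that includes this delay; the helper name `period_ns` is made up for illustration.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


CLK_TO_Q_NS = 4.26  # clk-to-q of the DP16KD output, as quoted from the timing report

before_ns = period_ns(109.0)  # period at the frequency reported before the extra register
after_ns = period_ns(121.0)   # period at the frequency reported after it

# Fraction of the original period spent on clk-to-q (assumption: same path).
share_before = CLK_TO_Q_NS / before_ns

if __name__ == "__main__":
    print(f"period before: {before_ns:.2f} ns, after: {after_ns:.2f} ns")
    print(f"clk-to-q share of the original period: {share_before:.0%}")
```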