<geist>
i'm guessing SYNCI is much like an ISB on arm, which is to say it acts as an instruction barrier to make sure everything before is committed
<geist>
useful after doing things that futz with icaches and whatnot
<geist>
FENCE.I in riscv
<Ameisen>
well, the mandated sequence is: SYNCI... (for each cache line to invalidate), SYNC, JR.HB/JALR.HB
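That mandated sequence, sketched in assembly (hedged: classic MIPS style with delay slots, 32-byte cache lines assumed; real code should query the line size from Config registers):

```asm
# a0 = start address, a1 = end address
1:
    synci   0(a0)           # writeback D-cache / invalidate I-cache for this line
    addiu   a0, a0, 32      # assumed 32-byte line size
    sltu    t0, a0, a1
    bnez    t0, 1b          # more lines to do?
    nop                     # branch delay slot
    sync                    # wait for all the SYNCIs to complete
    jr.hb   ra              # hazard barrier: clear instruction hazards, return
    nop                     # branch delay slot
```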
<geist>
okay, so yeah synci must do as you say, invalidate
<geist>
but yeah having a dance to keep the cpu from prefetching across the cache line invalidate is normal for everything but x86
<Ameisen>
what's confusing is that R6 mandates that SYNCI invalidations be globalized.
<Ameisen>
so... what does SYNC do...
<geist>
SYNC keeps the local cpu from prefetching new stuff
<geist>
ie, makes sure that the next instructions after it are not using old icache entries
<geist>
and possibly waiting for the cpu to complete the flush
<geist>
actually possibly the second, and the JALR.HB is the barrier
<geist>
for ARM64 for example this would be `IC IVAU <address>; DSB; ISB`
<geist>
the IC IVAU is the invalidate, DSB waits for it to finish locally and globally, ISB keeps the cpu from prefetching ahead of it
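The full ARM64 analogue, sketched per the ARMv8 manual's self-modifying-code sequence (hedged; x0 = modified address, and the clean step is needed because D- and I-caches are not coherent):

```asm
dc  cvau, x0        // clean the D-cache line to the point of unification
dsb ish             // wait for the clean to complete across the shareability domain
ic  ivau, x0        // invalidate the I-cache line
dsb ish             // wait for the invalidate, locally and on other cores
isb                 // flush the pipeline so no stale prefetched instructions run
```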
<Ameisen>
well, it's unpredictable if the effective address of the SYNCI references any cache line containing instructions to be executed between the SYNCI and the subsequent JALR/JR.HB - that is to say, until the hazard barrier executes, the state of those instructions is undefined
<Ameisen>
well, unpredictable
<geist>
yah, so the SYNC is probably 'wait for it to complete locally and globally'
<geist>
if you think of the SYNCI as being asynchronous
<Ameisen>
the hazard barrier I guess then flushes the pipeline?
<geist>
yah
<geist>
also the cpu probably can't prefetch across it, so it effectively flushes in the sense that every instruction in front of it is drained before it's allowed to continue
<geist>
in the more superscalar/out of order way of thinking
<Ameisen>
"Full visibility of the new instruction stream requires execution of a subsequent SYNC instruction, followed by a JALR.HB, JR.HB, DERET, or ERET instruction. The operation of the processor is UNPREDICTABLE if this sequence is not followed."
<geist>
makes sense
<geist>
and in that case DERET and ERET are probably documented as being a serializing instruction
<Ameisen>
in a VM sense, I'm guessing that the hazard barrier is effectively a no-op.
<geist>
or whatever nomenclature they use
<Ameisen>
hazard clearing
<geist>
yep, if you're just running things one at a time then they're a nop
<geist>
you can just stop the world at the SYNCI, do the flush, and probably ignore the SYNC as well
<Ameisen>
having the SYNC instruction do something is actually useful since I can concatenate SYNCI calls into ranges.
<geist>
that's the idea. you can issue as many SYNCI instructions as you want i'm guessing, the cpu is running them asynchronously and broadcasting them to other cpus
<Ameisen>
Yeah. Though as per the spec, I can just defer doing anything until SYNC
<geist>
and then the SYNC halts *the data fetching/etc portion of the cpu* until everything has completed
<geist>
yes
<geist>
like i said in ARM this is DC (data cache) and IC (instruction cache) instructions, and DSB (data sync barrier) acts as the wait-for-completion
<Ameisen>
There's still one annoying edge case specific to my logic that I need to deal with - if a chunk, after invalidation, ends with a delay branch as its last instruction when it didn't before, I technically need to invalidate the subsequent chunk as well, since its first instruction now requires delay branch handling...
<geist>
but the cpu is allowed to keep prefetching, so you provide an ISB to block that (which is your .HB stuff)
<Ameisen>
ad infinitum
<Ameisen>
yeah, they just word it to avoid mentioning those concepts
<Ameisen>
which both makes sense but is also confusing a bit.
<geist>
yah
<geist>
believe me it took a while to grok it from the ARM manuals too, they're equally wishy washy about it
<Ameisen>
I wish it weren't just unpredictable to execute an instruction that's on an invalidated cache line prior to a hazard barrier
<Ameisen>
I'd prefer to throw an exception since I have to handle it in some way now.
<Ameisen>
effectively, 'handle the case where a chunk invalidates itself while it's executing'
<Ameisen>
it's not hard to handle, but it's annoying and also somewhat difficult to test
<geist>
yeah that's all pretty much undefined on a real thing
<geist>
since it's undefined you just stick in a system("rm -rf /");
<Ameisen>
it's UNPREDICTABLE rather than UNDEFINED
<geist>
yah arm calls that CONSTRAINED UNPREDICTABLE i think
<Ameisen>
they're allowed to cause arbitrary exceptions, though, but I need to think over what that means in this context since instructions generally specify exactly what and when they can throw.
<geist>
well the way you figure it the cpu either sees one or the other, but it wont see some third thing
<Ameisen>
but it must not halt or hang the processor
<geist>
same with a SMP system. other cores either see the new or old cache line
<geist>
but if you're flushing a page that any cpu is executing from at the time, you're totally playing with fire
<Ameisen>
logically, though the way they define it, it COULD have any result, just so long as it's a valid result.
<Ameisen>
they're lax in their definition of it. It's just the 'arbitrary exceptions' part that throws me off
<Ameisen>
I can interpret that as "it could arbitrarily throw an exception" or "it can throw any exception"
<Ameisen>
regarding the delay branch issue, I _think_ I can just noop-pad after the first instruction and inject the delay branch logic in there as needed? A big no-op chunk isn't ideal but it's probably better than having to rewrite a chunk (and also invalidate all of its patches)
<Ameisen>
though honestly I'm unsure.
<geist>
yah i dunno the precise rules of the delay branch slot on mips
<geist>
i assume there's a lot of 'dont do that' or 'that's undefined'
<geist>
that you have to pick a reasonable thing
<Ameisen>
it's more that if I arbitrarily add a delay branch check all the time, without having any knowledge of the previous instruction, it's going to be the slowest possible kind of implementation.
<Ameisen>
whereas most of the time I've seen such branches, they've been branches that I can trivially resolve and turn into a simple `jmp`
<geist>
right, bad enough that even if you streamline the shit out of your interpreter you're always going to need to track if you're in a branch delay slot, etc
<geist>
on ARM32 the annoying thing was that any instruction can write to r15 (PC) which also acts as a branch
<Ameisen>
that sucks, though at least you know that it's doing it
<geist>
so in my little translator thingy, i could turn certain things into JMP opcodes, but i still had to test for 'did r15 get written to last cycle?'
<Ameisen>
if (dest == r15) this is really a branch
<geist>
yah
<Ameisen>
so I'd probably just emit it as a weird compact branch
<geist>
yah but it's a weird compact branch that may have an arbitrary data op instruction in front of it
<Ameisen>
delay branches are annoying, but their worst annoyance is as said that they can impact instructions that aren't in this chunk (aka 'cache line')
<geist>
like you can OR with the PC, etc
<Ameisen>
so invalidations can cause weird things
<Ameisen>
my generator has a 'delayed compact branch' thing, which injects compact branch logic after an instruction. Though it's only used if I have to call an instruction that's emulated instead (written in C++) that could potentially branch.
<Ameisen>
basically just checks if PC changed and if so, does an indeterminate compact branch
<geist>
yah
<Ameisen>
the worst kind of branch - indeterminate
<Ameisen>
has to enter emulated mode temporarily to fetch addresses.
<Ameisen>
that's why I'm very hesitant to just inject the worst delay branch check at the start of every chunk. Even if it's not taken, in the off-chance that a delay branch slot is the first instruction, I'd rather it use a more optimal form.
<geist>
kinda makes me want to dig out my old mips board. i have exactly one and it's so slow
<geist>
CI20 iirc. something like that
<Ameisen>
re: unpredictable, I think they'd call the case where either the old or the current instruction could be executed 'UNSTABLE'
<Ameisen>
"Unlike UNPREDICTABLE values, software may depend on the fact that a sampling of an UNSTABLE value results in a legal transient value that was correct at some point in time prior to the sampling."
<geist>
it's so slow it's.... delicious
<geist>
almost retro
<Ameisen>
I don't think that any real MIPS32r6 hardware actually exists
<AmyMalik>
that's sad
<Ermine>
heat: if not freeing doesn't help, then it's not uaf?
<zid>
stale reference can still be stale even if you don't happen to have freed the other end
<zid>
just changes what happens
<zid>
if(f->num == 0) return free(f); f->num--; blah(f); doesn't get less crashy if you delete the free :P
<zid>
just changes the crash to someone trying to index UINT_MAX into an array somewhere
<Ermine>
f->num-- is UAF in this case
<Ermine>
also I interpret "doesn't help" as "symptoms are the same"
<Ermine>
is it hard to implement page quarantine?
<heat>
Ermine: basically, this can be happening: page is allocated separately (page cache, whatever). page gets freed but still somehow in the system (still stuck in the pagecache or something). page gets reused as a stack page. someone writes to the page cache, or frees the page again. you ded.
<heat>
i only checked what happens if stack freeing doesn't free backing pages nor unmap anything. basically just leave it dormant in vmalloc space, KASAN poisoned
<heat>
as for page quarantine no it wouldn't be hard, i already have a quarantine for regular slab objects, this would basically extend it
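A minimal sketch of that page-quarantine idea (all names here are hypothetical; a real kernel would hook this into the page allocator's free path and pick a much larger window):

```c
#include <assert.h>
#include <stddef.h>

#define QUARANTINE_SLOTS 64

struct quarantine {
    void *slot[QUARANTINE_SLOTS];
    size_t head;        /* next slot to overwrite */
    size_t count;       /* live entries */
};

/* Instead of freeing a page immediately, park it; return the evicted
 * oldest page (now safe to really free) or NULL while the FIFO fills. */
static void *quarantine_page(struct quarantine *q, void *page)
{
    void *evicted = NULL;
    if (q->count == QUARANTINE_SLOTS)
        evicted = q->slot[q->head];     /* oldest entry ages out */
    else
        q->count++;
    q->slot[q->head] = page;
    q->head = (q->head + 1) % QUARANTINE_SLOTS;
    return evicted;
}
```

The delay between the logical free and the real free is what turns a silent reuse into a detectable fault.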
<nikolar>
heat: did ia64 get removed from gcc 15?
<heat>
i dont know
<nikolar>
you're supposed to be the itanium fanboy here, smh
<Ermine>
is he?
<heat>
i'm a moderate fan
<heat>
i think it's interesting but lots of it is just for the memes
<heat>
nikolar: > GCC 15 Un-Deprecates Itanium IA-64 Linux Support
<heat>
so i guess that says it?
<nikolar>
oh cute
<nikolar>
> heat | i think it's interesting but lots of it is just for the memes
<nikolar>
couldn't agree more
<heat>
my favourite architecture is most likely x86, by far
<heat>
i appreciate arm64 but i don't (yet?) appreciate its details, and i definitely don't know it well enough
<heat>
riscv needs to become useful first
<nikolar>
wow i agree with that as well
<heat>
heat-nikolar agreeing streak: 2 (new record!)
<nikolar>
dang, we're on a roll
* kof673
throws a c++ stone, that ought to fix that
<kof673>
there's some chain reaction disagreement there
<kof673>
see, you can't say anything about c++ because it will ruin the streak :D
<kof673>
c++ stone is just slowly burning from within, give it time
<nikolar>
Nice
<nikolar>
heat: let's go
<heat>
i see a slight problem
<heat>
i can only repro this using KASAN
<heat>
my KASAN implementation has a weird oddity, where I map the same shadow page over many TB using CoW page tables, and only un-CoW these page tables when I actually allocate something there
<heat>
problem is that i'm not doing proper TLB maintenance when switching page tables around
<heat>
meaning that could _possibly_ be contributing to the unpredictable behavior i'm seeing
<sortie>
A contributor reported to me that "foo | less" might randomly abort in a wrong way.. I thought 'quick in-and-out 20 minute adventure' and ... I'm so deep down this rabbit hole, I have stared into the abyss
<sortie>
The only way to solve it was to read all the relevant language in the standard, write dozens of tests, run them everywhere, try to make sense of the data, and somehow figure out how a 'foo | bar | qux' pipeline is supposed to work with job control
<sortie>
You may go 'sortie that's simple lemme explain' and that's when armed people pull up in a van and drag you into it and then you have to listen to me explain obscure race conditions that can happen that nobody should have to think about while we drive away never to be seen again
<sortie>
All in all, a pretty good day of osdev :) 7/10 would recommend
<heat>
have you tried using linux
<nikolar>
heat: are you talking about suse in particular
<heat>
not in particular
<heat>
but, now that you mention it, it's a very good option
<zid>
Isn't that a firable offence
<heat>
no im uhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh on windows 8 right now
<nikolar>
why
<heat>
this metro layout is rad, big future
<nikolar>
why
<heat>
don't flood me with questions man i'm enjoying the upgrade from vista
<heat>
i skipped 7 because it's just a terrible system
<sortie>
They couldn't make a more unpleasant pattern even if they tried
<heat>
i think you're jelly
<sortie>
X11 pattern known to cause cancer
<heat>
that's what i think
<heat>
sortix no x11
<sortie>
wait
<heat>
onyx not really x11 but have toy port
<sortie>
heat: you have x11?
<heat>
kind of
<sortie>
im sorry for your loss
<geist>
really? I kinda like the monochrome hash pattern
<nikolar>
heat: what exactly did you port
<geist>
the default one on mac classic is nice too
<nikolar>
geist: disregard sortie, he's a party pooper
<sortie>
The pattern is so weird on my screens, especially when it moves, and it fucks up my eyes
<sortie>
I will poop your party
<nikolar>
get better eyes, duh
<heat>
nikolar: the xserver packages, appropriated xorg video drivers (fbdev), base x11 libs
<zid>
most LCDs are too trash for a pattern like that yea
<nikolar>
not bad
<heat>
i think i have keyboard input too, but no mouse (i don't even have mouse drivers)
<sortie>
(parents walking past telling their children not to pay attention to the crazy shouting man outside the university of osdev)
<nikolar>
heat: how badly does it work
<heat>
nikolar: i don't know, i need a client
<heat>
which i don't have
<nikolar>
kek
<heat>
xterm is a PITA and needs more packages
<nikolar>
didn't you at least build xterm, or xclock or something
<heat>
and at that point i need to just do rpms for these
<heat>
because this is all using the old build system
<heat>
that cross-compiled
<heat>
for now, i need other stuff, though
<heat>
gpg is my next great goal
<nikolar>
so you can sign your packages?
<heat>
i can already sign them from linux but yeah
<nikolar>
ok, verify your packages
<heat>
yeah
<heat>
i also totally need rustc
<heat>
like, kind of an emergency
<heat>
rpm package sign verification depends on a rustc library
<heat>
the thing you're supposed to use, compared to some broken gpg thing they have or whatever
<heat>
so I'll need to figure out a cross-compilation of rustc again, and then rebuild under onyx
<nikolar>
lol
<nikolar>
another reason not to use rpm
<nikolar>
i guess
<heat>
rpm is great actually
<nikolar>
a rust dep isn't great, no
<heat>
if you're doing security stuff, i understand rust
<nikolar>
lol
<heat>
it's a pain in the ass for bootstrapping, but literally who does this
<heat>
i'm implementing multi-threaded coredumps because git clone is crashing :/
<nikolar>
kek
<ZetItUp>
osdev wiki is kinda slow, so i ask here quick, if i double fault after enabling interrupts it's cause the IRQs are mapped wrong right?
<heat>
not necessarily
<heat>
but possibly
<zid>
a double fault is the weirdest possible outcome
<nikolar>
qemu should tell you what happens
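For reference, the usual qemu invocation for this (hedged sketch: flag names are from qemu's logging options; `mykernel.elf` is a placeholder):

```shell
# Log interrupts/exceptions and keep qemu from resetting on a triple
# fault, so the final faulting CPU state stays visible.
qemu-system-i386 -d int,cpu_reset -no-reboot -no-shutdown -kernel mykernel.elf
```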
<heat>
do your traps work?
<zid>
no fault is normal, triple fault is normal. double means you took a fault inside the first fault, but *that* handler ran okay. are you certain you don't proceed to triple fault?
<heat>
e.g __builtin_trap(); and check if it ends up in a plausible exception
<ZetItUp>
yeah im getting 0x8 exception in the qemu log
<zid>
breakpoint your double fault handler instead
<zid>
it *delivering* a double fault isn't the same as you processing one
<heat>
what comes before the 0x8?
<zid>
my money's on a full triple fault
<ZetItUp>
i've tried to debug and i guess it's my IDT asm code bah
<ZetItUp>
or my stack is f'd; when i try to return, the IP goes to 0x00000003 :P
<ZetItUp>
english good.
<zid>
don't do that ^
<nikolar>
lol