WIP: Netpoller for noos #12
Conversation
Force-pushed from 4b9fb31 to 3c9e5c2
Since rtos.Note can be woken up multiple times, I also want to propose making rtos/Note.Clear() return the current state of the Note, see 1706d44. I think this is considered an API-breaking change, so maybe add something like TryClear() instead. This will be helpful for interrupts that can trigger independently. Otherwise there is no way to do this race-free.
@michalderkacz As you suggested via email, I had a look at how we could allow a goroutine to sleep on multiple notes/pollfds. The difficulty I see is that it probably breaks the atomics-based sync mechanism that I adapted from netpoll.go. I propose to merge this, after I clean it up and you approve the changes, as a transparent change without API changes to rtos.Note. I thought deeply about the current implementation and tested it to a fair amount. I currently have no real-world need to sleep on multiple pollfds, and as long as it's not performance relevant, it can be implemented at higher levels. If the need arises, we can build upon this in a new PR.
I prefer to leave the current Note implementation as it is for now, without any changes. I have some things that rely on it and want to move slowly to any new mechanism. Let it live for a while next to the new netpoll-based type (say, Event). We can then play with the new Event type for a while, allowing all (mostly my own) existing code to be compiled with the new version of Go and still use the old notes, or revert to them in case of any problem. The current Note implementation is also the only possible mechanism to communicate with the raw tasks, because they are outside the Go scheduler's scope, and I want them to be available at least for a while.
The lack of such a select-like mechanism is bothersome. It turns around the natural way to do things. If you want a goroutine to wait for two events at the same time, the higher-level approach requires two additional goroutines with two additional notes, or running your event loop using two goroutines with a mutex (this last approach may be problematic if your event loop is also a state machine). As this multi-goroutine approach is almost unacceptable (for me), you end up with a somewhat reversed interface where the receiver must give its note to both senders (event sources) instead of simply waiting on two event-like variables published by the senders. If there are multiple entities interested in one event, the event source must maintain a collection of their notes. I haven't encountered this last case (multiple receivers of one event) in practice yet, but the case with the need to wait for two events is quite common. But let's give up this select-like behavior for now and start with your current implementation. Short TODO list for this PR:
I do now see that it's a bad idea to change the Note type. While the interface is the same, the new implementation differs in subtle ways. I will create a new type and check if I still need the additional pdClear state, which was added to get the same semantics as the old Note type. I haven't tried it, but given the Note type, you might get select-like behaviour with an additional type. Implementing this behavior directly in the netpoller should, in my opinion, only be done if it's necessary for synchronization or performance. I'm probably missing something, but if you think this approach might work for you, I can give it a try when cleaning up the PR.
Your proposal is interesting. I see it more like a tree of events, where you can register other event variables with any event variable. But a correct implementation of it, if you also allow unregister operations, will be difficult, because our event sources are mainly IRQ handlers (we have Go channels for non-IRQ stuff), which forces you to use lockless lists of events, and my experience with such types is that they are very difficult to implement and not as lightweight as you would expect for such an IRQ use case. Instead, I have good experience with non-precise events (see them as hash tables reduced to the hash only, compared to the precise event trees described above). Let's give it up for now and simply implement a new synchronization mechanism similar to note in that it remembers the fact of a Signal/Send/Wakeup/etc. call for an upcoming Recv/Wait/Sleep/etc. The detailed behavior and naming are in your hands, so indeed it may be more optimized and lightweight than any possible simulation of the runtime.note.
I rebased on master-embedded and moved the implementation. The new type is called rtos.Cond and provides only two methods: Signal and Wait, with Wait implicitly clearing the Cond. This basically represents a binary semaphore, which is also the main sync primitive in FreeRTOS for ISRs. It also comes naturally out of a minimal netpoller, which simplified the implementation a bit. A clear can still be done by calling Wait(0). This also returns the Cond state, which, as we discussed previously, might be needed in some cases. I will go over my comments, TODOs, and FIXMEs again, but if you want to have a look already, I would be interested in your comments about (1) the changes made to the tasker from an SMP perspective and (2) the new Cond type from an unprecise-events perspective. I had difficulties fully grasping what you said about unprecise events and couldn't really find online resources to improve my understanding.
This solution is copied from netpoll.go
I've written a small example for Pi Pico that uses the new type. About the Clean operation, I imagine much of this in my code:

```go
d.dma.SetSrc(buf.Addr())
d.p.CleanIRQs(mask)
d.cond.Wait(0) // clean cond before enabling interrupts and starting DMA
d.dma.Start()
d.p.EnableIRQs(mask)
```

but maybe I'm wrong. I like minimal interfaces, so we can try it as is. The Clean method may be added in the future if we get too many such comments explaining what the Wait(0) call does.

The Clean method is useful to clear all spurious signals that may occur before you set everything up. This is especially possible if you have, for example, two ISRs that signal on the same cond and you wait for both at the same time (my "reverse interface" case; you explicitly allowed it). It's also useful if Clean has the "release" or "publication barrier" semantic to synchronize the preceding setup. The separate Clean method also makes it clear that some signals may be missed between the return from Wait and the Clean call. The same can be said about the current Wait semantic that consumes the signal, but it isn't as obvious.
Can you explain this question more? Do you mean the last changes I did when adding support for Pi Pico?
From the "unprecise-events" perspective:
I didn't read about them anywhere. Maybe the hardware support for the WFE instruction in ARM was my inspiration to invent them (I really think I didn't invent them; they probably exist under a different name somewhere). So let's explain this concept starting from the extremely non-precise case. Let goroutines have an event bit in their G struct that informs about the occurrence of an event. The interface is like this:

```go
func RegisterEvent()          // registers the goroutine to wait for events
func UnregisterEvent()
func SendEvent()              // sets the G.event bit for all registered goroutines, wakes up those that wait for it
func WaitEvent(deadline) bool // waits for G.event != 0, clears it at exit
```

The simplest, correct but inefficient implementation is a no-op for three of these functions and almost a no-op for WaitEvent (it must check the deadline). Such an implementation reduces any algorithm that uses Event to polling. Our new Cond type could then look like this:

```go
type Cond struct{ v atomic.Bool }

func (c *Cond) Signal() {
	c.v.Store(true)
	SendEvent()
}

func (c *Cond) Wait(timeout time.Duration) bool {
	var deadline time.Time
	if c.v.Load() {
		goto end // fast path
	}
	if timeout == 0 {
		return false
	}
	if timeout > 0 {
		deadline = time.Now().Add(timeout)
	}
	RegisterEvent()
	defer UnregisterEvent()
	for !c.v.Load() {
		if !WaitEvent(deadline) {
			return false
		}
	}
end:
	c.v.Store(false)
	return true
}
```

As you can see, it's extremely imprecise. Any Signal call on any Cond variable wakes up all registered goroutines. But it can work reasonably well if the number of goroutines that sleep at the same time waiting for an event is small (say 3-4) and the Signal calls are infrequent. What can be done to improve this situation? The one-bit event can be extended to a set of event bits:

```go
type Event uint

const genMask = unsafe.Sizeof(uint(0))*8 - 1

var eventGen atomic.Uint

func AllocEvent() Event { return 1 << (eventGen.Add(1) & genMask) }

func (e Event) Register() // Event(0).Register() can be used to unregister
func (e Event) Send()

func WaitEvent(deadline) bool // waits for registered events
```

Until the number of allocated events becomes greater than 32 (64 on a 64-bit machine), these events are precise. This seems to be fine for our interrupt use case where we have a limited number of IRQs. We must accept some growing inefficiency if the number of allocated and actively used events becomes greater than the number of event bits. The Cond type becomes:

```go
type Cond struct {
	v atomic.Bool
	e Event
}
```

You can probably see the implementation of all its methods. The hardest thing to implement efficiently in the case of such non-precise events seems to be the Register and Send methods. A simple implementation may maintain a single list of registered goroutines without paying attention to the Event they register for. I don't want to go deeper into possible implementation details (I've already gone too far). I think you already have the picture and see how it fits (or not) into the netpoller concept.
I currently have a lot of Wait(0) in my code, but I'll have to review whether they are really needed. If there are spurious interrupts after things are set up correctly, there is probably already a bug. A problem with rtos.Note is that it's designed for one-time notifications. If we allow waking them multiple times, as you said, there is the chance to miss interrupts between Sleep and Clear. Making Sleep and Clear an atomic operation in Wait prevents this possibility. Think of a mailbox interrupt that notifies about available data. This also clarifies (your intuition was right) why Clear doesn't return the current value via an atomic CAS: it would encourage false usage of Note. I also think the minimal interface is in line with Go idioms. I see this more like a sync primitive, like futexes, which can be used to build more specific sync types that fit your needs better. These wouldn't necessarily need to live in embedded/rtos. Your comment about unprecise events made things clearer for me, thanks for clarifying! What I specifically meant was whether you think it's possible to implement the API you described using the Cond type. But I see you think about it the other way round, and it's probably not important right now; let's rather get some hands-on experience with the new Cond type. Regarding SMP, I'm interested whether you see any issues with waking the netpoller from another CPU. PS: A PR with a WIP prefix is considered not ready for merge. I will open another PR with some changes I was about to push.
Makes the wakerq global, consumed by a minimal netpoller implementation. The netpoller uses the same sync mechanics as the common netpoll.go implementation. It has one additional state that protects against a false wakeup if a note is enqueued during a Clear.
This is a performance optimization. The current rtos.Note implementation goes into a blocking syscall on every Sleep call, causing an expensive handoff (~800 µs of CPU time on N64). Using the netpoller instead allows an ordinary goroutine sleep to be used.
The need for this arose while implementing a driver for a DMA controller. Interrupts might trigger every few milliseconds, which is too long to busy-wait but too short to sleep on a Note if each sleep eats almost a whole millisecond of CPU time.
TODO