#avr | Logs for 2016-09-23

[02:26:09] <Levitator> Have an idea for a stream library that works like unix pipes.
[02:26:43] <Levitator> Where you have a sequence of super-simple stream objects connected together by pipe objects, and as the program state advances, it rearranges the pipes to obtain different behaviors.
[02:27:45] <Levitator> You put bytes in one end, and eventually something else comes out the other end. When that happens, you plug the pipes together some other way to read something different.
[02:44:09] <jacekowski> your explanation is nothing but simple
[02:44:57] <Levitator> I like simple.
[02:45:45] <Levitator> Seems well-suited to an MCU environment where you want to have pre-allocated building blocks with well-defined and deterministic state-change costs.
[02:59:02] <Thrashbarg> Levitator: UNIX pipes are a great concept. I tried something similar ages ago but didn't get too far with it lol
[02:59:15] <Levitator> UNIX is vastly overrated.
[02:59:22] <Levitator> Sorry, underrated.
[02:59:26] <Thrashbarg> lol
[02:59:27] <Levitator> The opposite of overrated.
[02:59:30] <Thrashbarg> yea :P
[03:00:22] <Levitator> Pipes are a brilliant concept that combines both modularity and high performance through parallelism.
[03:00:50] <Thrashbarg> just a shame FOSS freaks and associated yuppy larvae have bastardised it :P
[03:01:17] <Levitator> So, you have all these processes that can be effortlessly recombined in effortless ways, and they can all run on different cores.
[03:01:46] <Levitator> Damn, my brain is just not working this morning.
[03:02:20] <Thrashbarg> have you looked into LISP much?
[03:02:28] <Levitator> This MCU pipe design, however, is not so much to emphasize parallelism, but just modularity.
[03:02:33] <Thrashbarg> yeah
[03:02:36] <Levitator> I've played with LISP briefly.
[03:03:03] <Thrashbarg> I fiddled around with a similar idea only the result looked like LISP more than UNIX pipes
[03:03:20] <Thrashbarg> each function could be spawned on any processor
[03:03:45] <Thrashbarg> but yea, same story, didn't bother looking into it much
[03:03:49] <Thrashbarg> brb
[03:37:42] <Levitator> Yes, this is working out very well. I like it.
[03:38:22] <Levitator> So you can have a device which is piped, by default, to a ringbuffer, and when you plug in something else, it flushes the ringbuffer into that pipe, then pulls the ring out from in between them and they're directly connected.
[03:39:37] <Emil> I still can't bloody figure out what is going on with the device
[03:40:35] <Levitator> So, maybe you have: device | bit_shifter | line_parser | record_handler
[03:41:19] <Levitator> Now, you decide you want to suspend the pipe for a time-critical operation, so you just do: device | bit_shifter | ring_buffer
[03:41:56] <Levitator> And it buffers stuff for later when you are ready. Then you plug the original pipeline into the ring buffer, empty it, then plug it back the way it was originally to proceed.
[04:07:12] <Levitator> Or just make the ringbuffer smart enough to know when its exit is connected, and then it's even simpler.
[05:40:16] <skz81> Levitator: you can still have a read/write-like interface, with function pointers... You set the ring buffer's output pointer to something instead of NULL; it knows its output is connected to something, and can call it to empty itself
[05:40:53] <skz81> But you'll need some scheduler-like mechanism to switch between "components"
[05:40:53] <Levitator> It's more like a linked list.
[05:42:16] <skz81> Levitator, you potentially need a pointer to identify the instance of the component, but also a pointer to the actual implementation (just like a this and a virtual method in C++)
[05:42:54] <Levitator> It's implemented as a Pipe base class that has an in() function, and it holds a pointer to another Pipe instance whose in() it calls with its output, forming a linked list.
[05:43:33] <Levitator> The function that sets the destination pipe is virtual, so you can override it and set a behavior to perform when linking to a new outlet.
[05:44:05] <Levitator> So, if your Pipe is a Ringbuffer, and you set its output non-null, then it knows to flush itself.
[05:44:45] <skz81> We agree, then. The rest is implementation details, you assume C++, okay. Anyway your virtual method actually IS a pointer :)
[05:45:26] <skz81> If you have only one instance of each component, then you can define instances as static data in the functions
[05:45:41] <skz81> (just another possible design, with a strong assumption added)
[05:46:24] <Levitator> You don't need a scheduler. You can just make the last element in the pipeline reconfigure the pipeline when it receives its input.
[05:47:03] <Levitator> And the functions to alter the pipe links are atomicised to prevent races with ISRs.
[05:48:28] <Levitator> Come to think of it, you can just check for reentrance and bail out after updating the ringbuffer, if you find an active operation running.
[05:48:39] <Levitator> So, only the ringbuffer has to be atomicised.
[05:49:45] <Levitator> So, you update the ringbuffer, enable interrupts again, then flush the ringbuffer with interrupts enabled.
[05:50:31] <Levitator> Then if a nested interrupt happens, it updates the ringbuffer, sees that there is already an ISR running, and simply returns, letting the first one consume all of the queued data.
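The reentrancy-guard scheme just described could look something like the sketch below. On the AVR, on_byte() would be the ISR body and the queue update would run with interrupts masked (cli/sei); on a host there are no interrupts, so plain variables stand in for the protected state, and every name here is illustrative:

```cpp
#include <cstddef>

const std::size_t N = 32;
unsigned char ring[N];
std::size_t head = 0, tail = 0;
bool draining = false;   // is some invocation already consuming the queue?
int consumed = 0;        // stand-in for the downstream pipeline

void deliver(unsigned char) { ++consumed; }

void on_byte(unsigned char b) {
    // --- on AVR: this update would run with interrupts disabled (cli) ---
    ring[head] = b;
    head = (head + 1) % N;
    if (draining) return;   // nested call: the outer invocation will drain
    draining = true;
    // --- on AVR: interrupts re-enabled (sei) here ---

    // Drain with interrupts enabled; bytes queued by nested calls in the
    // meantime are picked up by this same loop.
    while (tail != head) {
        deliver(ring[tail]);
        tail = (tail + 1) % N;
    }
    draining = false;
}
```

This matches the description: only the ring update needs atomicity, and a nested invocation just enqueues and bails out, leaving the outermost one to consume everything.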
[05:53:06] <skz81> <Levitator> You don't need a scheduler. >> oh right, total crap from myself, you always have a call-flow. I was thinking of UNIX pipes actually, which are buffers by themselves, and need the client to wake up to consume them
[07:01:03] <Levitator> Hrm. No fetchadd instruction on the AVR?
[07:02:54] <Lambda_Aurigae> don't think so.
[07:03:04] <Lambda_Aurigae> look at the avr instruction set...it has everything.
[07:03:21] <Levitator> I'm looking at it now, but it's not really indexed. It's a pain to search.
[07:03:25] <Lambda_Aurigae> http://www.atmel.com/images/Atmel-0856-AVR-Instruction-Set-Manual.pdf
[07:03:35] <Lambda_Aurigae> pdf search is easy.
[07:04:35] <Lambda_Aurigae> but, avr core is technically a RISC processor so it has limited commands.
[07:06:19] <Levitator> Fetchadd is important for lockless atomicity.
[07:07:56] <Levitator> Ah, there's a summary table. Great.
[07:08:34] <Lambda_Aurigae> sounds to me like you are making something more complex than it needs to be really.
[07:09:20] <Lambda_Aurigae> closest you will have there is ADIW..
[07:09:30] <Lambda_Aurigae> add immediate to word.
[07:09:52] <Lambda_Aurigae> adds a number (0 to 63) to a register pair.
[07:10:03] <Lambda_Aurigae> and places the result in said register pair.
[07:10:20] <Levitator> It's more complex but it's more performant because it's one instruction and it doesn't block interrupts.
[07:10:52] <Lambda_Aurigae> not available on all devices though.
[07:11:09] <Lambda_Aurigae> atmega only as I recall.
[07:11:11] <Lambda_Aurigae> not on attiny
[07:11:58] <Levitator> Updating a register pair is useless because the load to get the data into the registers is a separate operation.
[07:12:14] <Lambda_Aurigae> then switch to x86
[07:13:04] <Levitator> Do they sell x86 single-board computers for $3?
[07:13:14] <Lambda_Aurigae> not that I know of.
[07:13:23] <Lambda_Aurigae> but the avr doesn't have the command you want.
[07:13:52] <antto> how bout xmega?
[07:14:13] <antto> i don't understand asm, but iirc they had slightly different instructions
[07:14:43] <Levitator> Looks like it does what I want to registers, but not to memory. That's disappointing.
[07:14:59] <Levitator> Oh, well. Can just disable interrupts like everyone else, then.
[07:15:31] <antto> i use ATOMIC_BLOCK(ATOMIC_RESTORESTATE) or something such
[07:15:34] <Levitator> Seems like kind of a blunt instrument, but what do you want from a $3 computer?
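The trade-off under discussion: on ISAs that have a memory fetch-and-add (x86 `lock xadd`, for instance), a shared counter can be updated locklessly in one atomic step, while the AVR's separate load/add/store forces interrupt masking instead. A host-side sketch of the lockless version, with the avr-libc ATOMIC_BLOCK alternative shown only as a comment since those headers won't build here:

```cpp
#include <atomic>

// Lockless shared counter: fetch_add compiles to a single atomic
// read-modify-write on ISAs that provide one.
std::atomic<unsigned> counter{0};

unsigned lockless_increment() {
    return counter.fetch_add(1);   // returns the previous value
}

// The AVR has no such instruction: counter++ there is a separate load,
// add, and store, so the usual fix is to mask interrupts around it:
//
//   #include <util/atomic.h>
//   ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
//       counter_byte++;   // no ISR can interleave with this
//   }
```

The lockless form is what Levitator means by "one instruction and it doesn't block interrupts"; the ATOMIC_BLOCK form is the "disable interrupts like everyone else" fallback.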
[07:16:08] <antto> i know there was an allwinner cpu with x86 instructions
[07:18:31] <Lambda_Aurigae> technically, registers are just memory locations.
[07:18:41] <Lambda_Aurigae> the first 32 memory locations, in the case of the AVR
[07:19:20] <antto> what operation does he want to do exactly? (since i'm not an asm guru)
[07:19:33] <Levitator> Yeah, great, so that means reserving a register, and avoiding all of the associated opcodes that write it, in order to do lockless stuff.
[07:20:11] <Levitator> Presuming that there is some way to express that to the compiler, like maybe by binding a variable to the register.
[07:20:25] <antto> btw, xmega has virtual ports
[07:20:39] <Levitator> What are virtual ports?
[07:20:44] <antto> maybe they can be exploited somehow
[07:20:44] <Lambda_Aurigae> oh, you are wanting to do this in C?
[07:21:09] <Levitator> I usually write in C++.
[07:21:22] <Lambda_Aurigae> hehe.
[07:21:24] <Levitator> The register option could be reasonable for a very small program in assembler.
[07:21:34] <Lambda_Aurigae> all kinds of overkill for a microcontroller in my opinion, but to each their own.
[07:21:43] <antto> Levitator i think their idea is that you can do a pile of changes to them, and then cause them to be updated to an actual real port
[07:22:05] <antto> but i might be wrong, i haven't used that feature
[07:22:06] <Lambda_Aurigae> xmega just doesn't seem to be a very popular platform.
[07:22:30] <antto> xmega is much x, very mega!
[07:22:55] <Levitator> By the way, what's the most minimal MCU where you can expand the memory into the megabyte range?
[07:23:15] <Lambda_Aurigae> xmega
[07:23:22] <Levitator> Interesting.
[07:23:23] <Lambda_Aurigae> or pic32
[07:23:27] <Lambda_Aurigae> well,
[07:23:35] <Lambda_Aurigae> it's doable on an 8052
[07:23:46] <Lambda_Aurigae> but you can only go in 64K blocks...have to do bank switching.
[07:23:57] <Levitator> Bank switching is nightmarish.
[07:24:12] <Lambda_Aurigae> I've done an 8052 board with 128K of nvsram that's switchable in 32K blocks into both program and data space.
[07:24:24] <Lambda_Aurigae> we did it on the c64 and c128 for many years.
[07:24:26] <Levitator> Flat or virtualized memory model ftw.
[07:24:26] <Lambda_Aurigae> worked well.
[07:24:30] <antto> xmega has options for moar RAM, and i code in C++ on them, i find them VERY nice
[07:24:43] <antto> haven't used extra RAM tho
[07:25:07] <Levitator> What sort of bus interface would you use for the addon RAM?
[07:26:15] <Lambda_Aurigae> it's likely going to be similar to the z80 ram interface I bet.
[07:26:31] <Lambda_Aurigae> haven't looked at the xmega for that.
[07:27:00] <Lambda_Aurigae> there are a couple of atmega chips that have extended ram interfaces too..but that's for data only and only out to 64K, so, again, bank switching.
[07:27:06] <Levitator> I have no idea what addon RAM is like on these tiny machines. I have never seen it.
[07:27:19] <Lambda_Aurigae> one would have to look at the datasheet.
[07:27:36] <Levitator> I wouldn't be surprised if basic configurations use something like SD-card IO to implement it.
[07:27:41] <Levitator> In order to keep the wire count down.
[07:27:57] <Lambda_Aurigae> some can use SD card for expanded ram..but that's gonna be horribly slow.
[07:28:11] <Levitator> Would expect the latency to be substantial.
[07:28:59] <Lambda_Aurigae> http://www.atmel.com/devices/ATXMEGA128A1.aspx
[07:29:07] <Lambda_Aurigae> that one has the EBI..external bus interface.
[07:29:25] <Levitator> Otherwise, you're going to need an awful lot of wires. At least 8 for data, and 16 more for address if you're going to be able to fetch a byte in a reasonable number of cycles.
[07:29:58] <Levitator> Well, maybe less than 16 address lines depending on your memory space size.
[07:30:01] <Lambda_Aurigae> on the atmega chips, they use a multiplexed address/data bus
[07:30:30] <Lambda_Aurigae> the EBI on the xmega has 3 modes..
[07:30:41] <Levitator> That makes sense. So, maybe 8 wires, and you need three clocks or so for a fetch.
[07:31:00] <Lambda_Aurigae> for the atmega external ram interface it is 16 wires.
[07:31:09] <Lambda_Aurigae> 8 for address high...8 for address low and data
[07:31:37] <Levitator> atmega does that? So, does that include the 328p?
[07:31:45] <Lambda_Aurigae> no..only some atmega chips.
[07:31:53] <Lambda_Aurigae> atmega8515 and atmega128 both do that I know of.
[07:31:57] <Lambda_Aurigae> but that's 64K at a time max.
[07:32:20] <Levitator> That sure beats 2K.
[07:32:33] <Lambda_Aurigae> I use the atmega1284p that has 16K sram.
[07:32:44] <antto> i thought "DMA" is what's used for adding external RAM
[07:32:45] <Lambda_Aurigae> or the pic32mx270f256b that has 64K sram.
[07:33:06] <Lambda_Aurigae> dma is direct memory access..that is for peripherals to access memory without the main cpu being involved.
[07:33:43] <Lambda_Aurigae> http://www.atmel.com/Images/doc8077.pdf
[07:33:54] <antto> ah, so then this external bus interface.. yeah some xmegas have it
[07:33:55] <Lambda_Aurigae> page 263 starts the external bus interface description.
[07:34:34] <Lambda_Aurigae> apparently up to 128MB using 3 ports.
[07:34:43] <Lambda_Aurigae> that is, with SDRAM
[07:34:54] <Levitator> Wows.
[07:34:54] <Lambda_Aurigae> 16MB using SRAM
[07:35:10] <Lambda_Aurigae> with the plain SRAM interface you can also do memory mapped peripherals.
[07:35:12] <Levitator> So you could build a "real" computer with that.
[07:35:20] <Lambda_Aurigae> and, it's after 7:00 and I need to head to work.
[07:35:32] <Lambda_Aurigae> will catch ya all on the flip side.
[07:35:37] <Levitator> Merry Friday.
[07:44:41] <antto> xmega is really like a much better atmega
[07:44:52] <LeoNerd> HELLYES
[07:45:01] <LeoNerd> Simply having identical units with same-shaped register maps is nice
[07:45:20] <antto> yeah, very C++ friendly too
[07:45:23] <LeoNerd> I think ATmega designers just roll a d16 twice to decide where all the register ports are
[07:46:03] <antto> the pinouts are also the same
[07:46:22] <LeoNerd> (If you don't have a d16 to hand you can make one up out of a d8 and a d2 (aka "a coin"))
[07:46:34] <antto> so you have for example 44pin xmega "A" with a range of flash sizes, all on the same datasheet
[07:46:41] <antto> same pinout, same everything
[07:46:41] <LeoNerd> Yah
[07:49:00] <antto> i also love the synchronous usart ;P~
[08:00:04] <Levitator> I wonder if I'll be able to keep this IO library under 10kB.
[08:00:20] <Levitator> It makes all IO asynchronous and event-driven.
[08:00:37] <Levitator> Or, at least, that's the plan.
[08:02:12] <Levitator> Also implements a String, Vector, and Ringbuffer class, so all the comforts of home.
[08:22:03] <skz81> <LeoNerd> (If you don't have a d16 to hand you can make one up out of a d8 and a d2 (aka "a coin")) >> and if U need a d24 just take a d8 and a d3.... hrmmmm
[08:22:15] <skz81> :) :) :)
[08:22:25] <LeoNerd> d3 is pretty rare
[08:22:32] <skz81> ho ? crap !
[08:22:43] <LeoNerd> Oh. I suppose it's a dual-labeled d6
[08:22:45] <LeoNerd> :)
[08:24:14] <skz81> Heh, does the job, indeed !
[08:47:15] <Levitator> Damn, these are some powerful antidepressants.
[08:47:37] <Levitator> Almost feel productive.
[08:51:30] <carabia> oh so that's why you're doing 10 kbytes of useless libs for avr
[08:53:05] <Levitator> Yeah, it's useless, which is why it's typical for people to use the crappy Arduino library to do everything in the foreground using blocking calls.
[08:53:46] <Levitator> The entire processor is fully interrupt-driven, but there is no library that takes advantage of that.
[08:54:29] <Levitator> So, if you wanted to transfer 9.6kB at 9600 baud, you would have to freeze the entire unit for ~10 sec.
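For scale, assuming common 8N1 framing (10 bits on the wire per data byte), the blocking time works out as follows:

```cpp
// 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per byte on the wire.
double bytes_per_second() { return 9600.0 / 10.0; }          // 960 B/s

double seconds_to_send(double bytes) {
    return bytes / bytes_per_second();                       // 9600 B -> 10 s
}
```

So a blocking 9.6kB transfer monopolizes the MCU for about ten seconds, which is the case for doing the I/O asynchronously.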
[08:54:58] <carabia> arduino is the maker black for internet of thongs(tm) now shh
[08:55:18] <Levitator> Sorry, I know my typing is bad, but come again?
[08:55:25] <LeoNerd> I prefer to think of it as "Duplo"
[08:55:51] <LeoNerd> Levitator: I write almost all my AVR code in a very interrupty style
[08:56:01] <carabia> the darkness is creeping back into your mind, Levitator
[08:56:04] <carabia> feel the coming depression
[08:56:13] <LeoNerd> The toplevel scheduler is a little task selector; anything that might block in fact just posts a message to a "task ID" to wake it up again
[08:57:18] <Levitator> I'm basing this design on a system of pipelines that have the hardware at one end, and the application at the other.
[08:57:35] <Levitator> So, everything is a linked list of operations performed on a stream of incoming events.
[08:57:54] <Levitator> Or a stream of outgoing events, as the case may be.
[08:59:20] <Levitator> And the first item in the pipeline is a ringbuffer, so that everything that happens after that continues with events enabled.
[09:01:38] <antto> i use interrupt-driven USART
[09:02:11] <Levitator> This will enable using all inputs as UARTs, asynchronously.
[09:14:19] <Levitator> I wonder if this concept I'm messing with is by some total coincidence in any way analogous to IRQL in Windows.
[10:50:35] <Levitator> Man, there is a lot of black magic going on in the asm blocks.
[10:51:11] <Levitator> WTF is this: static __inline__ void __iCliParam(const uint8_t *__s) {cli(); __asm__ volatile ("" ::: "memory"); (void)__s;}
[10:51:37] <Levitator> Ok, first of all, that is an empty ASM block that is declared as clobbering memory. Why? Who knows, since it doesn't do anything.
[10:52:20] <Levitator> Then, after that, there is a uint8_t pointer which is cast to void, and also does nothing. It's just evaluated as a pointer.
[10:52:26] <Levitator> What the hell is all this for?
[10:53:40] <Levitator> Why would you even pass a pointer into a function just to put it in an expression by itself? What does that even do. Nothing?
[11:07:18] <twnqx> Levitator: i guess it's a memory barrier to prevent the cli from being reordered
[11:11:28] <Levitator> Ughhhh. How horrible.
[11:11:41] <DKordic> Realy!?
[11:12:02] <Levitator> They should have at least named it reasonably so that you could tell what the hell they were doing.
[11:12:05] <DKordic> Where is that shit from?
[11:12:46] <Levitator> From the AVR libc headers.
[11:13:54] <Levitator> So, in respect to what could cli() get reordered?
[11:14:20] <twnqx> instruction rescheduling due to inlining
[11:15:03] <twnqx> i mean, an optimizer can reorder statements that do not depend on each other at will
[11:15:31] <Levitator> Right, but we know that everything after cli() depends on cli()
[11:15:35] <twnqx> http://www.atmel.com/webdoc/AVRLibcReferenceManual/optimization_1optim_code_reorder.html
[11:15:37] <twnqx> _we_ do
[11:15:44] <twnqx> the memory barrier tells it to the compiler
[11:15:50] <Levitator> Ah, thanks. I knew I saw that somewhere, and I couldn't remember where.
[11:16:32] <twnqx> i must say that this example looks better :P
[11:17:03] <twnqx> also, i'd probably do something like #define BARRIER ... and later use that
[11:17:59] <Levitator> Yeah, I was wondering why they didn't do something like that.
[11:18:01] <twnqx> and i think what the avr libc does in the above quoted code is a more elaborate version that is more sure to stay in order
[11:21:31] <Levitator> Yeah. This note is very interesting, but it makes no comment about the weird void expression. It's probably related, but it's not clear how.
[11:28:08] <Levitator> Doesn't entirely make sense, though. Says correctness is determined by access to volatiles. If that were true, then you could never count on a = 1; a = 2; to produce a consistent result since neither are volatile.
[11:38:42] <twnqx> but both affect the same variable
[11:39:09] <twnqx> i'd recommend asking the author, the avr-libc mailing list or even the avr-gcc mailing list
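The construct being puzzled over can be shown in isolation. The empty asm with a "memory" clobber is a pure compiler barrier: it emits no instructions, but tells GCC that all memory may have changed, so loads and stores cannot be moved across it. The trailing `(void)__s;` most likely just silences an unused-parameter warning, since (as best I can tell) those avr-libc helpers must match the pointer-taking signature that GCC's `__attribute__((cleanup))` machinery behind ATOMIC_BLOCK requires. A host-buildable sketch, with the AVR's cli()/sei() replaced by stubs:

```cpp
// Compiler barrier: no code is generated, but the "memory" clobber
// forbids GCC/Clang from reordering memory accesses across this point.
#define COMPILER_BARRIER() __asm__ volatile("" ::: "memory")

// Stubs standing in for the AVR's real cli()/sei() instructions.
volatile bool interrupts_enabled = true;
void cli_stub() { interrupts_enabled = false; }
void sei_stub() { interrupts_enabled = true; }

unsigned shared_counter = 0;

void guarded_increment() {
    cli_stub();
    COMPILER_BARRIER();   // the protected access cannot be hoisted above cli
    shared_counter++;     // read-modify-write an ISR might otherwise corrupt
    COMPILER_BARRIER();   // ...and cannot be sunk below sei
    sei_stub();
}
```

Without the barriers, nothing in the C source ties `shared_counter++` to the cli/sei calls, so the optimizer is free to move it outside the protected region, which is exactly the reordering twnqx's link describes.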
[11:39:09] <antto> * the optimizer puts on his smart hat