I've just improved the stream reading code so that it uses less RAM. Previously, if you called read(2048), it allocated a 2048-byte buffer from the heap, read into it, then allocated a second buffer of 2049 bytes (one extra for the null terminator), copied the data across, and freed the first buffer. It was implemented this way because the read function didn't know in advance how many bytes it would actually get (it might get fewer than 2048), so it needed to shrink the buffer once it knew exactly how many bytes had been read.

Well, now it's improved. This was some rather old code, written before we could easily shrink RAM chunks in place. So you should now see much better behaviour when reading files.
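As an aside, callers can sidestep the allocate-and-shrink dance entirely by reading into a preallocated buffer. A minimal sketch (using CPython's io module for illustration; `stream` stands in for any binary stream with a readinto() method):

```python
import io

# Stand-in for a file or socket opened in binary mode.
stream = io.BytesIO(b"hello world" * 100)

buf = bytearray(2048)      # one allocation, reused for every read
mv = memoryview(buf)       # lets us slice the valid part without copying

n = stream.readinto(buf)   # fills buf in place, returns the byte count
chunk = mv[:n]             # zero-copy view of just the data that arrived
```

Since the buffer is allocated once and reused, no per-read heap traffic occurs at all, which also helps with the fragmentation discussed below.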

Ok, thank you for staying engaged on that issue.
I will wait for the fix on the main branch.
The way I see it, the potential is still high to break the application when manipulating blocks of memory, which is a fundamental part of a scripting language. One way would be to have a lot of free RAM, which should reduce the probability of fragmentation. Another way would be to restart the system periodically (if the application permits) while saving/restoring system state variables. If the GC can allocate fixed block sizes, that could also reduce fragmentation.
I will fix this problem with both options, unless some better idea comes up.

nelfata wrote:
If there is something we can do about running a process periodically to get the memory chunks moved that would be great, or if there is a way in the memory allocator to control the allocated bytes. For example for small buffers allocate a multiple of 64 bytes, and for larger buffer a multiple of 1024.

This is exactly what MicroPython does - it allocates memory in chunks of 16 bytes. This already wastes a fair amount of memory (8 bytes per object on average).
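The rounding involved is just integer arithmetic; a sketch of what a 16-byte-chunked allocator does with a request (names here are mine, not MicroPython's):

```python
BLOCK = 16

def rounded_alloc(n):
    """Round a request of n bytes up to the next multiple of BLOCK,
    the way a 16-byte-chunked allocator would."""
    return (n + BLOCK - 1) // BLOCK * BLOCK

# A 1-byte object still occupies a full 16-byte block (15 bytes wasted);
# over uniformly distributed request sizes the waste averages ~8 bytes.
assert rounded_alloc(1) == 16
assert rounded_alloc(16) == 16
assert rounded_alloc(17) == 32
```

The same formula with BLOCK = 64 or 1024 gives the multi-tier scheme suggested in the quote above; larger blocks trade more per-object waste for less fragmentation.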

nelfata wrote:
The way I see it, the potential is still high to break the application when manipulating blocks of memory, which is a fundamental part of a scripting language.

Yes, this potential is high for any language with automatic memory management, not just scripting languages. For example, languages like Java and Go are also affected. Actually, it's not even about automatic memory management - with manual memory management, you can write an application which will eventually collapse due to fragmentation just as easily.

One way would be to have a lot of free RAM,

That doesn't really help in the long run. For example, Java apps love to consume gigabytes of RAM, then gigabytes of swap (where they become very slow), before eventually hanging or crashing just the same.

that should reduce the probability of fragmentation, another way would be to restart the system periodically (if the application permits) while saving/restoring system state variables. If the GC can allocate fixed block sizes, that could also reduce fragmentation.
I will fix this problem with both options, unless some better idea comes up.

So, most scripting languages make it easy to write programs, and stop there. Python goes further - it also lets you write efficient programs. But for that, programs should be written in a somewhat different way. That's the whole idea behind MicroPython (at least for me) - you can get the best of both worlds, but for the "efficient" part you will likely need to learn something new. Writing Python apps the same way you do on the "desktop" won't get you very far on MicroPython.

I am more of an embedded developer than a high-level programmer. So I understand your concepts, but in Python it is often not as straightforward to think that way. Still, I will use whatever MP has to offer now that this functionality is available.

I am sympathetic to those who worry about memory fragmentation. It is a real problem, and needs to be dealt with if uPy is going to be used in serious, robust embedded applications.

Well, programming in a way which is conscious of memory fragmentation will go a long way towards helping. Eg, preallocating bytearray objects, or premaking lists of a certain length and filling the unused entries with None.
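Both idioms mentioned above look like this in practice (a minimal sketch; sizes are arbitrary):

```python
# A fixed-size byte buffer, allocated once up front and reused,
# instead of building up a bytes object piecewise:
buf = bytearray(1024)

# A list premade at its final length and filled in place,
# instead of growing it with repeated append() calls
# (each of which may trigger a reallocation):
items = [None] * 100
for i in range(100):
    items[i] = i * i
```

Because neither object ever changes size after creation, the heap layout stays stable for the lifetime of the program.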

When I first started designing uPy I seriously considered a more managed heap so that it could have a compacting garbage collector. But I could not figure out a way to do that efficiently, since you would need to access all arrays (in C) using offsets from a managed base pointer, among other things.

But I think even with the current memory implementation we are able to do some compacting. For example, for a given GC chunk of memory, if all the things that look like pointers to this chunk are *known* to be pointers, then we can move the chunk and rewrite all the pointers. How can we guarantee that all pointers are really pointers? Well, if all candidate pointers are in the heap themselves (ie nothing pointing from bss, stack or registers), and they are all within chunks that are known to be Python objects, and we know the type of all these Python objects, then we can be sure.
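The "move the chunk and rewrite all the pointers" step can be illustrated with a toy model. This is only a sketch of the idea (the real GC works on raw C memory, not Python dicts): the heap maps addresses to values, and a cell known to be a pointer is tagged with the address it refers to.

```python
# Toy heap: address -> contents.  Cells tagged ("ptr", addr) are cells
# that are *known* to be pointers, per the argument in the text.
heap = {
    0x10: b"payload",        # the chunk we want to move
    0x20: ("ptr", 0x10),     # a known pointer to it, itself on the heap
    0x30: ("ptr", 0x10),     # another known pointer
}

def compact(heap, old, new):
    """Move the chunk at `old` to the free address `new`, then rewrite
    every heap cell known to point at it."""
    heap[new] = heap.pop(old)
    for addr, val in heap.items():
        if val == ("ptr", old):
            heap[addr] = ("ptr", new)

compact(heap, old=0x10, new=0x40)
```

The guarantee discussed above is exactly what makes the rewrite loop safe: if some cell merely *looked* like a pointer to 0x10 but was really an integer, rewriting it would corrupt the program.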

We can then go further and make a special pointer_bss segment which is known to hold only pointers and then these pointers can be rewritten as well.

Ultimately you can also make the stack and registers "typed" so you are guaranteed at any point in the code to know where all the pointers are. But this is hard and costly since you need to store lots of information about each function and the stack layout.

The original Macintosh (you know - the one with the 68000 and 128K of RAM) had a compacting heap.

They did it by using double indirection. So instead of getting a pointer to an object, you got a pointer to a pointer to an object.

They used a Master Table, which had all of the pointers to the objects in the heap. The 68000 only had 24-bit addressing, so they packed some flags into the upper byte.

You could lock an object (which returned a direct pointer to the object) which prevented the object from being moved. This was often used for performance optimizations and required for data used by ISRs.

When you did an allocation, there was a function to move your object as high as it would go on the heap. This was useful for objects with a long lifetime.
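The handle scheme described above can be sketched in a few lines. Names here are illustrative, not Apple's actual API: the point is that callers hold an index into the master table, so compaction only has to fix one table entry per object.

```python
master_table = []   # handle -> current address of the object
storage = {}        # fake heap: address -> data
next_addr = 0

def new_handle(data):
    """Allocate data and return a handle (an index into the master table)."""
    global next_addr
    storage[next_addr] = data
    master_table.append(next_addr)
    next_addr += 1
    return len(master_table) - 1

def deref(handle):
    """Double indirection: handle -> master table -> data."""
    return storage[master_table[handle]]

def move(handle, new_addr):
    """Compaction may relocate the block; only the master table is rewritten.
    Every outstanding handle remains valid."""
    old = master_table[handle]
    storage[new_addr] = storage.pop(old)
    master_table[handle] = new_addr

h = new_handle(b"resource")
move(h, 100)                      # the block moves...
assert deref(h) == b"resource"    # ...but the handle still dereferences
```

Locking, in this model, would simply mean promising not to call move() on that handle while a raw pointer to its storage is outstanding.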

Yep, a precise garbage collector sounds inefficient in terms of CPU time, and double indirection in terms of memory. Note that we could try to save memory on double indirection by using based pointers. If memory allocation happens in 16-byte blocks, there's no need to store the lowest bits in the indirection array, so 16-bit indexes will allow addressing 64K * 16 = 1M of heap. But I personally don't think compaction is worth the effort (until proven otherwise).
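The based-pointer arithmetic works out as follows (HEAP_BASE is a hypothetical base address, chosen only for the example):

```python
BLOCK = 16
HEAP_BASE = 0x2000_0000   # hypothetical heap base address

def index_to_addr(idx):
    """Expand a 16-bit block index back into a full address."""
    return HEAP_BASE + idx * BLOCK

def addr_to_index(addr):
    """Compress a block-aligned address into a small index;
    the low 4 bits are always zero, so nothing is lost."""
    return (addr - HEAP_BASE) // BLOCK

# 2**16 indexes, each covering a 16-byte block:
max_reach = (2 ** 16) * BLOCK   # 1 MiB of addressable heap
```

So each indirection-table entry shrinks from a 4-byte pointer to a 2-byte index, at the cost of one add and one shift per dereference.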