Summary: lighttpd has of course had some memory leaks (and perhaps still has some today), but this bug is not about those problems. The main problem here is that the memory gets fragmented, which is why malloc()/free() doesn't return the memory to the system; the memory is not lost to lighttpd, and lighttpd/malloc() will reuse it.

History

I can confirm that this is the case, using the FreeBSD port. In my test, it grew faster than that (60M in a couple hours); it seemed to grow linearly with traffic.

This bug in particular was a deal breaker for me. I was trying lighttpd out as an alternative to pound. Performance is good, and memory usage is good when it starts, but it's simply too broken for me to actually use. Maybe I'll check back in a few months.

Sorry, my previous description may not be accury. I run lighttpd 1.4.18 with fastcgi and php5; the website provides a download service using PHP's readfile() function. Memory usage of the lighttpd process grew rapidly beyond 1G in half an hour and later used up the swap. If I put another server (lighttpd + mod_proxy + squid) in front to cache access to my website, both merchine suffered memory growth. Later I found that removing the readfile() call from the download service's PHP page solves the problem :P

Sorry again for my poor English and carelessness. In my previous post, "merchine" should be "machines" and "accury" should be "accurate". My servers run Gentoo Linux, kernel 2.6.22, php 5.2.14-p20070914-r2, and lighttpd 1.4.18 with only mod_fastcgi and mod_accesslog. I have tried Apache to make sure the PHP library and web page are OK.

It has now been more than two months since I asked for a valgrind log, and no one has even responded to that => dropping Priority/Severity to normal.

And I think most "leaks" reported here are just high memory usage due to big files sent from a backend (fastcgi/proxy); it is known that lighty reads the complete response from the backend into memory as fast as possible, and that memory is not freed but reused later.

When the high load goes back down to normal, the 20 * 10k is free'd, but NOT the malloc(10) made at the end of the peak, since it's cached.

As you know, malloc() uses sbrk(), which in the end simply manages one large contiguous array of memory. If the memory at the end of this vector is in use (e.g. the memory is cached somewhere for the next thousand years), the memory released by free() calls inside this vector will never actually be returned to the system.
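To illustrate the point about sbrk(), here is a toy model in C of a heap that can only shrink from the top. It is only a sketch: the toy_* names are made up for this example, the model always allocates at the top and never reuses holes, and it is not how glibc's (or any other real) malloc() is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of an sbrk()-style heap: blocks sit back to back, and the heap
 * can only shrink by lowering its top (the "break"). Hypothetical names. */
#define TOY_MAX_BLOCKS 64

typedef struct {
    size_t size[TOY_MAX_BLOCKS]; /* block sizes, in address order */
    bool   used[TOY_MAX_BLOCKS]; /* is the block still allocated? */
    size_t count;                /* number of blocks below the break */
} toy_heap;

/* Allocate by growing the heap at the top; returns the block index,
 * or -1 if the toy table is full. */
int toy_alloc(toy_heap *h, size_t size) {
    if (h->count == TOY_MAX_BLOCKS) return -1;
    h->size[h->count] = size;
    h->used[h->count] = true;
    return (int)h->count++;
}

/* Free a block, then hand back every free block that sits at the top. */
void toy_free(toy_heap *h, int idx) {
    assert(idx >= 0 && (size_t)idx < h->count);
    h->used[idx] = false;
    while (h->count > 0 && !h->used[h->count - 1]) h->count--;
}

/* Bytes the system still sees as belonging to the process. */
size_t toy_break(const toy_heap *h) {
    size_t total = 0;
    for (size_t i = 0; i < h->count; i++) total += h->size[i];
    return total;
}

/* The scenario from the comment above: 20 * 10k during the peak, then a
 * small long-lived malloc(10) at the end of it. Returns the final break. */
size_t toy_peak_scenario(bool free_small_block) {
    toy_heap h = {0};
    int big[20];
    for (int i = 0; i < 20; i++) big[i] = toy_alloc(&h, 10 * 1024);
    int small = toy_alloc(&h, 10);
    for (int i = 0; i < 20; i++) toy_free(&h, big[i]);
    if (free_small_block) toy_free(&h, small);
    return toy_break(&h);
}
```

In this model, toy_peak_scenario(false) leaves the break at 20 * 10240 + 10 bytes although only 10 bytes are in use, while toy_peak_scenario(true) lets the heap shrink back to 0. A real allocator would at least reuse the holes for later allocations, which matches the observation that the memory is not lost to lighttpd.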

I can confirm this bug. I run a medium-traffic server which has two lighttpd daemons running, one normal and one SSL. It seems that the SSL daemon suffers more from this bug than the other one, even though the HTTP daemon handles more traffic. After 1 day and 9 hours the SSL daemon takes 53 MB RAM:

Currently the server is handling 2 ~350MB file downloads. When the server was serving 4 simultaneous downloads it became unable to display any webpages until at least 1 of the downloads was finished. The server is on a 10Mbps line, and it wasn't even running at 50% capacity so there was room for the requests.

This has been happening for a long time; I just never said anything until now because I noticed that when one person did, the response was that it was an isolated incident. Well, it's not.

with 16 GB of memory. I am using mod_proxy_core/mod_proxy_backend_http, serving no local files. The server has been up for 3.5 days and its memory usage has climbed to nearly 2 GB. During this time, lighttpd has proxied around 57 million requests, with at least a terabyte of response data.

I ran lighttpd for a while under valgrind and did not find any significant memory leaks. I have included the valgrind log.

I monitored lighttpd's memory usage over the course of half an hour and watched it grow from 25 MB to 1.1 GB in two short spurts where RSS grew by hundreds of megabytes. The rest of the time, memory usage grew very gradually or not at all.

I looked around through the code and started to suspect the following pointers, which can be reallocated frequently:

srv->conns->ptr
srv->joblist->ptr
srv->fdwaitqueue->ptr

When these pointers are reallocated, they stick around until the next realloc() call, which is more and more likely to occur when the load is high. One possible scenario that might trigger a realloc of one or more of these pointers is where you have large files getting streamed to high-bandwidth clients, which might starve other connections in the fdevent loop, leading to an increase in the number of connections and in the size of the joblist.

I tried preallocating more elements in these arrays, using the attached patch. I'll report later on the result.
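For readers who have not looked at the code, this is roughly what growing and preallocating such a pointer array looks like. It is a minimal sketch, not the attached patch and not lighttpd's actual data structures: the ptr_array type, its functions and the growth step of 16 slots are all assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical growable pointer array, loosely modelled on a "->ptr" array. */
typedef struct {
    void  **ptr;      /* the array that gets reallocated as it grows */
    size_t  used;     /* slots in use */
    size_t  size;     /* slots allocated */
    size_t  reallocs; /* how often ptr had to be resized */
} ptr_array;

/* Make room for at least `want` slots; returns 0 on success, -1 on failure. */
int ptr_array_reserve(ptr_array *a, size_t want) {
    if (want <= a->size) return 0;
    void **p = realloc(a->ptr, want * sizeof(*p));
    if (p == NULL) return -1;
    a->ptr = p;
    a->size = want;
    a->reallocs++;
    return 0;
}

/* Append one element, growing by an assumed step of 16 slots when full. */
int ptr_array_append(ptr_array *a, void *elem) {
    if (a->used == a->size && ptr_array_reserve(a, a->size + 16) != 0) return -1;
    a->ptr[a->used++] = elem;
    return 0;
}

void ptr_array_free(ptr_array *a) {
    free(a->ptr);
    a->ptr = NULL;
    a->used = a->size = 0;
}

/* Append n dummy elements and report how many realloc() calls were needed;
 * prealloc > 0 reserves that many slots up front. Returns (size_t)-1 on
 * allocation failure. */
size_t ptr_array_count_reallocs(size_t n, size_t prealloc) {
    static int dummy;
    ptr_array a = {0};
    if (prealloc > 0 && ptr_array_reserve(&a, prealloc) != 0) return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        if (ptr_array_append(&a, &dummy) != 0) {
            ptr_array_free(&a);
            return (size_t)-1;
        }
    }
    size_t r = a.reallocs;
    ptr_array_free(&a);
    return r;
}
```

With these assumptions, appending 1000 elements costs 63 realloc() calls without preallocation and a single one with 1024 slots reserved up front. Preallocating only reduces how often the array is resized and possibly moved; whether that has any effect on the overall memory growth is a separate question.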

This didn't work. I still saw lighttpd's memory usage climb to around 1 GB. It took longer to get there, but I can't say whether that's related to the changes I made. I did notice one other realloc() in fdevent.c (in fdevent_revents_add()) - this might be worth a look.

I think it should be possible to separate allocation/reallocation of memory that persists for a long time (i.e. the connection, fdwaitqueue, joblist arrays and so on) from allocation/freeing of memory that persists for a much shorter amount of time (namely, chunkqueues and the chunks and buffers that they contain). The global chunkpool is a little bit problematic - perhaps it should just be flushed every now and then.
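The "flush the pool every now and then" idea could look something like the following. This is a sketch under assumptions, not lighttpd's chunk code: demo_chunk, demo_pool and their functions are hypothetical names, and the pool is a plain free list of released chunks.

```c
#include <stdlib.h>

/* Hypothetical short-lived buffer ("chunk") and a pool caching released ones. */
typedef struct demo_chunk {
    struct demo_chunk *next;
    char  *data;
    size_t size;
} demo_chunk;

typedef struct {
    demo_chunk *unused;       /* released chunks kept for reuse */
    size_t      unused_count; /* length of the unused list */
} demo_pool;

/* Take a chunk from the pool if one is cached, otherwise allocate a new one.
 * Returns NULL on allocation failure. */
demo_chunk *demo_pool_get(demo_pool *p, size_t size) {
    demo_chunk *c = p->unused;
    if (size == 0) size = 1; /* avoid the implementation-defined malloc(0) */
    if (c != NULL) {
        p->unused = c->next;
        p->unused_count--;
        if (c->size < size) {
            char *d = realloc(c->data, size);
            if (d == NULL) { free(c->data); free(c); return NULL; }
            c->data = d;
            c->size = size;
        }
        c->next = NULL;
        return c;
    }
    c = malloc(sizeof(*c));
    if (c == NULL) return NULL;
    c->data = malloc(size);
    if (c->data == NULL) { free(c); return NULL; }
    c->size = size;
    c->next = NULL;
    return c;
}

/* Hand a chunk back to the pool instead of freeing it. */
void demo_pool_release(demo_pool *p, demo_chunk *c) {
    c->next = p->unused;
    p->unused = c;
    p->unused_count++;
}

/* Free cached chunks until at most `keep` remain; returns how many were freed.
 * This could be called from a periodic timer, e.g. once load has dropped. */
size_t demo_pool_flush(demo_pool *p, size_t keep) {
    size_t freed = 0;
    while (p->unused_count > keep) {
        demo_chunk *c = p->unused;
        p->unused = c->next;
        p->unused_count--;
        free(c->data);
        free(c);
        freed++;
    }
    return freed;
}
```

Note that flushing only hands the buffers back to malloc(); whether the memory then goes back to the system still depends on where those buffers sit in the heap, which is why keeping long-lived and short-lived allocations apart matters.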

Subject changed from memory leak to memory fragmentation leads to high memory usage after peaks

Status changed from New to Invalid

Assignee deleted (jan)

Missing in 1.5.x set to No

We got reports that this bug report is misleading, so I will close it.

If you have new input regarding this bug (like good solutions for it, or just want to discuss it), please open a new bug (we can link it from here). Please do not reopen/reply to this one, thank you very much.