* Some points on public-inbox
@ 2018-06-09 17:06 Leah Neukirchen
  2018-06-12 10:09 ` Eric Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Leah Neukirchen @ 2018-06-09 17:06 UTC
To: meta
Hi,
over the last few days I've set up a public-inbox 1.1.0pre1 instance,
and noticed some things:
1) Makefile.PL only works properly when run from a checkout, not a tarball.
I replaced the beginning with
my @EXE_FILES = split("\n", `printf '%s\n' script/* 2>/dev/null`);
my $PM_FILES = `find lib 2>/dev/null`;
2) public-inbox-mda returns with status 1 when it gets a mail it
doesn't know where to deliver to. I think status 67 would be more
appropriate (EX_NOUSER).
3) IPv6 support needs the Socket6 module, this is not stated anywhere.
4) I think it would be useful if the thread overview displayed
the name of the initial poster, could this be added as an option?
5) Is there a way for the HTML view to list all served lists?
/ results in 404. How did you add links to meta/ and test/ on
https://public-inbox.org/ ?
6) I have a user account that uses .forward to call public-inbox-mda,
and use /etc/aliases to route the lists that are hosted primarily on
the server to it. What's the best approach to do this for mailing
lists I only mirror? Subscribe with a "secret" second address to the
list, and add this second address to publicinbox.<name>.address?
Or can public-inbox-mda also scan for List-Id etc and sort by it somehow?
Thank you very much for your work,
--
Leah Neukirchen <leah@vuxu.org> http://leah.zone

* Re: Some points on public-inbox
  2018-06-09 17:06 Some points on public-inbox Leah Neukirchen
@ 2018-06-12 10:09 ` Eric Wong
  2018-06-12 11:31 ` Leah Neukirchen
  ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Eric Wong @ 2018-06-12 10:09 UTC
To: Leah Neukirchen; +Cc: meta
Leah Neukirchen <leah@vuxu.org> wrote:
> Hi,
>
> over the last few days I've set up a public-inbox 1.1.0pre1 instance,
> and noticed some things:
Hey Leah, thanks for giving it a try! Sorry for the late reply,
been trying to avoid being at the computer too much for health
reasons.
> 1) Makefile.PL only works properly when run from a checkout, not a tarball.
> I replaced the beginning with
>
> my @EXE_FILES = split("\n", `printf '%s\n' script/* 2>/dev/null`);
> my $PM_FILES = `find lib 2>/dev/null`;
Thanks, I'd probably add "-name '*.pm'" to find(1) to filter out
directories. But I wonder if it's better to grep the MANIFEST
file...
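A minimal sketch of the MANIFEST idea, written in Ruby only for illustration (Makefile.PL itself is Perl). It assumes MANIFEST lists one path per line, optionally followed by whitespace and a comment. The helper name `manifest_files` and the sample entries are hypothetical.

```ruby
# Split MANIFEST-style text into installable scripts and Perl modules.
# Assumes one path per line, optionally followed by whitespace and a comment.
def manifest_files(text)
  # Keep only the first whitespace-separated token; blank lines yield nil.
  paths = text.each_line.map { |line| line.split.first }.compact
  {
    exe: paths.select { |p| p.start_with?('script/') },
    pm:  paths.select { |p| p.match?(%r{\Alib/.+\.pm\z}) }
  }
end

# Illustrative sample, not the project's real MANIFEST.
sample = <<~EOS
  INSTALL
  Makefile.PL
  lib/PublicInbox/MDA.pm
  lib/PublicInbox/WWW.pm   a trailing comment
  script/public-inbox-mda
  script/public-inbox-watch
EOS

files = manifest_files(sample)
puts files[:exe]
puts files[:pm]
```

Because the list comes from a file shipped in the tarball, this approach would not need git or find(1) at build time.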
> 2) public-inbox-mda returns with status 1 when it gets a mail it
> doesn't know where to deliver to. I think status 67 would be more
> appropriate (EX_NOUSER).
Sure. There's a bunch of places where we just die() and ignore
sysexits.h or similar. Could use some help checking for that
and patches are welcome :>
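For reference, a small sketch of the exit-status choice being discussed, in Ruby rather than the project's Perl. The numeric value of EX_NOUSER comes from sysexits.h; `delivery_status` and the address list are hypothetical and only show the decision, not how public-inbox-mda looks up inboxes.

```ruby
# Exit codes from sysexits.h relevant to an unknown recipient.
EX_OK     = 0
EX_NOUSER = 67 # addressee unknown

# Pick an exit status for a delivery attempt.
# `known` is a hypothetical list of addresses the MDA serves.
def delivery_status(recipient, known)
  known.include?(recipient) ? EX_OK : EX_NOUSER
end

known = ['meta@public-inbox.org']
status = delivery_status('nobody@example.com', known)
puts status
```

A real delivery script would finish with `exit(status)` so the MTA can tell "no such user" apart from a generic failure.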
> 3) IPv6 support needs the Socket6 module, this is not stated anywhere.
Oops, I thought this was standard :x Care to send a patch to
INSTALL for that?
> 4) I think it would be useful if the thread overview displayed
> the name of the initial poster, could this be added as an option?
If anything, I'd rather list ALL the recent participants in a
thread (it wouldn't require extra lookups).
But my philosophy is not to give anybody more credit/weight than
anybody else; and I'd rather people follow links if the Subject
seems interesting, not because of who started a particular topic
(especially when it comes to emails with [PATCH] subjects).
I also prefer to avoid having too many options to reduce
support/documentation costs, so if we do something like this,
it would be the default. But also, it's more clutter.
> 5) Is there a way for the HTML view to list all served lists?
Not currently... I'm not sure how the UI or configuration
should be or how to avoid clutter/scalability problems with many
inboxes. NNTP has standardized commands and clients can decide
how to show them, at least.
> / results in 404. How did you add links to meta/ and test/ on
> https://public-inbox.org/ ?
mkdir /srv/public-inbox/{meta,test} # static directory listing
Rack::Builder routes meta/ and test/ to varnish => public-inbox-httpd
All the HTTPS, static files and reverse proxying is handled
using yet-another-horribly-named-server written in Ruby and Rack :)
==> config.ru snippet <==
# random crap from yahns extras/
require "autoindex"
require "try_gzip_static"
require "yahns/proxy_pass"
autoindex = lambda do |path|
  Autoindex.new(TryGzipStatic.new(path), skip_dotfiles: true)
end
pi = Rack::Builder.new do
  run Yahns::ProxyPass.new('http://127.0.0.1:6081', # varnish
                           response_headers: {
                             'Age' => :ignore,
                             'X-Varnish' => :ignore,
                             'Via' => :ignore
                           })
end.to_app
unsub = Rack::Builder.new do
  run Yahns::ProxyPass.new('unix:/run/unsubscribe-psgi.sock')
end.to_app
unsub_re = %r{\A/u/[^/]+/[^/]+\z}
map('http://public-inbox.org/') do
  pfx = 'public-inbox'
  static = autoindex["/srv/#{pfx}"]
  txt_html = %r{\.txt\.html\z} # oops :x
  proxy = Yahns::ProxyPass.new("http://127.0.0.1:2080/$host$fullpath",
                               proxy_buffering: false)
  cascade = Rack::Cascade.new([static, proxy])
  run(lambda do |env|
    return redirect(env, nil, nil) if env['rack.url_scheme'] == -'http'
    case path_info = env["PATH_INFO"]
    when %r{\A/test(?:/?.*)\z}
      redirect(env, "try.public-inbox.org", path_info)
    when %r{\A/(?:git|meta|public-inbox(?:\.git)?)(?:/|\z)}x,
         '/HEAD', '/info/refs', '/git-upload-pack', '/description', '/cloneurl',
         %r{\A/objects/}
      pi.call(env)
    when unsub_re
      unsub.call(env)
    when txt_html
      redirect(env, 'public-inbox.org', path_info.sub(txt_html, '.html'))
    else
      cascade.call(env)
    end
  end)
end
==> end snippet <==
> 6) I have a user account that uses .forward to call public-inbox-mda,
> and use /etc/aliases to route the lists that are hosted primarily on
> the server to it. What's the best approach to do this for mailing
> lists I only mirror? Subscribe with a "secret" second address to the
> list, and add this second address to publicinbox.<name>.address?
> Or can public-inbox-mda also scan for List-Id etc and sort by it somehow?
I prefer to use public-inbox-watch for mirroring existing lists.
-mda is also a bit strict and opinionated (though I have plans to
make it less so, optionally), so it's mainly for non-mirrored
inboxes.
-watch is also safer and less likely to lose/bounce mail since
it hits a Maildir, first. -watch will scan for List-Id (or any
other header, such as X-Mailing-List) and put it into the
correct inbox. If space is a problem, a cronjob to remove
old files will help, but maybe it can unlink-on-import-commit
in the future.
I haven't thought much about mirroring with -mda, but I suppose
having a per-list subscriber address and extra
publicinbox.<name>.address entry works, too.
> Thank you very much for your work,
No problem :>

* Re: Some points on public-inbox
  2018-06-12 10:09 ` Eric Wong
@ 2018-06-12 11:31 ` Leah Neukirchen
  2018-06-13 2:07 ` [PATCH] Makefile.PL: do not depend on git Eric Wong
  2018-06-13 21:40 ` Some points on public-inbox Eric Wong
  2018-06-12 13:19 ` Some points on public-inbox Leah Neukirchen
  2018-06-12 17:05 ` Konstantin Ryabitsev
  2 siblings, 2 replies; 13+ messages in thread
From: Leah Neukirchen @ 2018-06-12 11:31 UTC
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
> Leah Neukirchen <leah@vuxu.org> wrote:
>> Hi,
>>
>> over the last few days I've set up a public-inbox 1.1.0pre1 instance,
>> and noticed some things:
>
> Hey Leah, thanks for giving it a try! Sorry for the late reply,
> been trying to avoid being at the computer too much for health
> reasons.
No problem, get well soon.
>> 1) Makefile.PL only works properly when run from a checkout, not a tarball.
>> I replaced the beginning with
>>
>> my @EXE_FILES = split("\n", `printf '%s\n' script/* 2>/dev/null`);
>> my $PM_FILES = `find lib 2>/dev/null`;
>
> Thanks, I'd probably add "-name '*.pm'" to find(1) to filter out
> directories. But I wonder if it's better to grep the MANIFEST
> file...
Yes, using MANIFEST is a better solution.
>> 2) public-inbox-mda returns with status 1 when it gets a mail it
>> doesn't know where to deliver to. I think status 67 would be more
>> appropriate (EX_NOUSER).
>
> Sure. There's a bunch of places where we just die() and ignore
> sysexits.h or similar. Could use some help checking for that
> and patches are welcome :>
I'll have a look at this.
>> 3) IPv6 support needs the Socket6 module, this is not stated anywhere.
>
> Oops, I thought this was standard :x Care to send a patch to
> INSTALL for that?
Will do.
>> 5) Is there a way for the HTML view to list all served lists?
>
> Not currently... I'm not sure how the UI or configuration
> should be or how to avoid clutter/scalability problems with many
> inboxes. NNTP has standardized commands and clients can decide
> how to show them, at least.
Yes, I was thinking of just having a list of "name - description",
straight from the config file?
>> / results in 404. How did you add links to meta/ and test/ on
>> https://public-inbox.org/ ?
>
> mkdir /srv/public-inbox/{meta,test} # static directory listing
I built something similar with nginx now.
>> and use /etc/aliases to route the lists that are hosted primarily on
>> the server to it. What's the best approach to do this for mailing
>> lists I only mirror? Subscribe with a "secret" second address to the
>> list, and add this second address to publicinbox.<name>.address?
>> Or can public-inbox-mda also scan for List-Id etc and sort by it somehow?
>
> I prefer to use public-inbox-watch for mirroring existing lists.
>
> -mda is also a bit strict and opinionated (though I have plans to
> make it less so, optionally), so it's mainly for non-mirrored
> inboxes.
>
> -watch is also safer and less likely to lose/bounce mail since
> it hits a Maildir, first. -watch will scan for List-Id (or any
> other header, such as X-Mailing-List) and put it into the
> correct inbox. If space is a problem, a cronjob to remove
> old files will help, but maybe it can unlink-on-import-commit
> in the future.
Space is not an issue, and scanning for special headers will avoid
getting password reminders and administrative messages into the archive.
I'll use watch then.
During testing, we also found another thing when obscure characters
are used in Message-IDs, esp. / and ?.
E.g. using a Message-ID of <F1WYEAZPOF.3LOD2T7ZHY9I1@localdomain/raw/T>
will create a corrupt link. Some more "ideas" are at
https://inbox.vuxu.org/pi-test/
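To illustrate the Message-ID problem: `/` and `?` have structural meaning in URLs, so a Message-ID has to be percent-encoded before it is used as a path segment. The following is a minimal Ruby sketch, not public-inbox's actual escaping code. `escape_mid` and the set of characters left unescaped are assumptions.

```ruby
# Characters assumed safe to leave unescaped inside a URL path segment.
SAFE_MID_CHAR = /[A-Za-z0-9._~@-]/

# Percent-encode every byte of a Message-ID that is not in the safe set,
# so "/" and "?" cannot be mistaken for URL structure.
def escape_mid(mid)
  mid.each_byte.map do |byte|
    ch = byte.chr
    ch.match?(SAFE_MID_CHAR) ? ch : format('%%%02X', byte)
  end.join
end

puts escape_mid('F1WYEAZPOF.3LOD2T7ZHY9I1@localdomain/raw/T')
# prints F1WYEAZPOF.3LOD2T7ZHY9I1@localdomain%2Fraw%2FT
```

With the slashes encoded, the trailing `/raw/T` stays part of the Message-ID instead of being parsed as extra path components.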
Thanks,
--
Leah Neukirchen <leah@vuxu.org> http://leah.zone

* Re: Some points on public-inbox
  2018-06-12 10:09 ` Eric Wong
  2018-06-12 11:31 ` Leah Neukirchen
@ 2018-06-12 13:19 ` Leah Neukirchen
  2018-06-12 17:05 ` Konstantin Ryabitsev
  2 siblings, 0 replies; 13+ messages in thread
From: Leah Neukirchen @ 2018-06-12 13:19 UTC
To: Eric Wong; +Cc: meta
Eric Wong <e@80x24.org> writes:
>> 6) I have a user account that uses .forward to call public-inbox-mda,
>> and use /etc/aliases to route the lists that are hosted primarily on
>> the server to it. What's the best approach to do this for mailing
>> lists I only mirror? Subscribe with a "secret" second address to the
>> list, and add this second address to publicinbox.<name>.address?
>> Or can public-inbox-mda also scan for List-Id etc and sort by it somehow?
>
> I prefer to use public-inbox-watch for mirroring existing lists.
>
> -mda is also a bit strict and opinionated (though I have plans to
> make it less so, optionally), so it's mainly for non-mirrored
> inboxes.
>
> -watch is also safer and less likely to lose/bounce mail since
> it hits a Maildir, first. -watch will scan for List-Id (or any
> other header, such as X-Mailing-List) and put it into the
> correct inbox. If space is a problem, a cronjob to remove
> old files will help, but maybe it can unlink-on-import-commit
> in the future.
Ok, that was a bit more fiddly than expected, because now I needed to
run two MDAs under the same user.
Also I noticed public-inbox-watch expects a different Maildir for every
list, so I had to put a maildrop in front of it to pre-sort by
List-Id... couldn't -watch do that itself?
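A pre-sort step of the kind described above could look roughly like this. It is a Ruby sketch of the routing decision only, not maildrop's or -watch's implementation. The function names, the table, and the Maildir paths are hypothetical.

```ruby
# Extract the List-Id value (the part in angle brackets) from a raw message.
# Only the header block is searched; folded continuation lines are unfolded first.
def list_id(raw)
  header = raw.split(/\r?\n\r?\n/, 2).first.to_s
  unfolded = header.gsub(/\r?\n[ \t]+/, ' ')
  m = unfolded.match(/^List-Id:.*?<([^>]+)>/i)
  m && m[1].downcase
end

# Choose a Maildir for a message; anything without a known List-Id
# goes to the fallback directory.
def maildir_for(raw, table, fallback)
  table.fetch(list_id(raw), fallback)
end

table = { 'list.example.org' => '/home/pi/Maildir/.list-example' }
msg = "From: a@example.com\nList-Id: Example list\n <list.example.org>\n" \
      "Subject: hi\n\nList-Id: <body.example.org>\n"
puts maildir_for(msg, table, '/home/pi/Maildir/.unsorted')
```

Matching on the header block alone means a `List-Id:` line quoted in the body cannot misroute a message.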
--
Leah Neukirchen <leah@vuxu.org> http://leah.zone

* Re: Some points on public-inbox
  2018-06-12 10:09 ` Eric Wong
  2018-06-12 11:31 ` Leah Neukirchen
  2018-06-12 13:19 ` Some points on public-inbox Leah Neukirchen
@ 2018-06-12 17:05 ` Konstantin Ryabitsev
  2018-06-13 1:57 ` Eric Wong
  2 siblings, 1 reply; 13+ messages in thread
From: Konstantin Ryabitsev @ 2018-06-12 17:05 UTC
To: Eric Wong; +Cc: Leah Neukirchen, meta
[-- Attachment #1: Type: text/plain, Size: 1617 bytes --]
On Tue, Jun 12, 2018 at 10:09:15AM +0000, Eric Wong wrote:
>I prefer to use public-inbox-watch for mirroring existing lists.
>
>-mda is also a bit strict and opinionated (though I have plans to
>make it less so, optionally), so it's mainly for non-mirrored
>inboxes.
>
>-watch is also safer and less likely to lose/bounce mail since
>it hits a Maildir, first. -watch will scan for List-Id (or any
>other header, such as X-Mailing-List) and put it into the
>correct inbox. If space is a problem, a cronjob to remove
>old files will help, but maybe it can unlink-on-import-commit
>in the future.
I opted in favour of -mda over -watch because Maildir performance
usually degrades linearly with the number of messages. A month of LKML
mail is anywhere from 25,000 to 40,000 messages, and maildirs tend to
handle that poorly due to performance overhead of listing tens of
thousands of files in a single folder.
Obviously, I can set up an archival job, but then I'd have to worry
about messages that weren't actually imported into the archive (because
they didn't pass spam tests, but are actually ham, for example). The
-mda script gives me this for free, with such messages being put into
the emergency folder for later review.
>I haven't thought much about mirroring with -mda, but I suppose
>having a per-list subscriber address and extra
>publicinbox.<name>.address entry works, too.
It works, but cloning details at the bottom of the page expose both
addresses:
public-inbox-init -V2 lkml lkml/ https://[not-live-yet].kernel.org/lkml \
linux-kernel@[not-live-yet].kernel.org linux-kernel@vger.kernel.org
-K
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

* Re: Some points on public-inbox
  2018-06-12 17:05 ` Konstantin Ryabitsev
@ 2018-06-13 1:57 ` Eric Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2018-06-13 1:57 UTC
To: Konstantin Ryabitsev; +Cc: Leah Neukirchen, meta
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Tue, Jun 12, 2018 at 10:09:15AM +0000, Eric Wong wrote:
> > I prefer to use public-inbox-watch for mirroring existing lists.
>
> I opted in favour of -mda over -watch because Maildir performance
> usually degrades linearly with the number of messages. A month of LKML
> mail is anywhere from 25,000 to 40,000 messages, and maildirs tend to
> handle that poorly due to performance overhead of listing tens of
> thousands of files in a single folder.
Right; but with inotify, getdents/readdir overhead is not a
problem outside of initial startup (or rescanning via SIGUSR1
after config changes).
> Obviously, I can set up an archival job, but then I'd have to worry
> about messages that weren't actually imported into the archive (because
> they didn't pass spam tests, but are actually ham, for example). The
> -mda script gives me this for free, with such messages being put into
> the emergency folder for later review.
Interesting take on it, thanks for sharing. I prefer to keep
the Maildir messages around for a bit and do my own reading off
that, for now[1]. I occasionally review syslog for spam notices
from -watch, but probably not enough :x
> > I haven't thought much about mirroring with -mda, but I suppose
> > having a per-list subscriber address and extra
> > publicinbox.<name>.address entry works, too.
>
> It works, but cloning details at the bottom of the page expose both
> addresses:
>
> public-inbox-init -V2 lkml lkml/ https://[not-live-yet].kernel.org/lkml \
> linux-kernel@[not-live-yet].kernel.org linux-kernel@vger.kernel.org
Hmm, I intended the multi-address support to work as a way to
have inboxes hosted simultaneously on multiple domains, either
temporarily as a migration strategy or permanently for redundancy.
So maybe there should be a way to specify an email address as
"hidden" for that, but still let -mda use it for routing.
Any thoughts on how to do it?
I'm thinking something like replacing '@' with '!' in the
.public-inbox/config file.
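A rough sketch of how that '!' convention might be parsed, in Ruby for illustration only. This is not an existing public-inbox feature. `parse_address` and the example addresses are hypothetical.

```ruby
# Parse one address entry under the hypothetical "!" convention:
# "user!example.org" means "route mail for user@example.org, but do not display it".
def parse_address(entry)
  if entry.include?('@')
    { address: entry, hidden: false }
  elsif entry.include?('!')
    { address: entry.sub('!', '@'), hidden: true }
  else
    raise ArgumentError, "not an address: #{entry}"
  end
end

entries = ['linux-kernel@lists.example.org', 'linux-kernel!vger.example.org']
parsed = entries.map { |e| parse_address(e) }

# Every address is usable for routing; only non-hidden ones would be displayed.
routable = parsed.map { |a| a[:address] }
visible  = parsed.reject { |a| a[:hidden] }.map { |a| a[:address] }
puts visible
```

Under this scheme the mirror's subscriber address would still route mail but would be left out of the cloning instructions.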
[1] I've thought about a Mairix/notmuch-like tool which extracts
messages from public-inboxes, so I won't need a redundant
copy in the Maildir.