Advogato blog for slamb (http://www.advogato.org/person/slamb/), Fri, 9 Dec 2016 15:27:02 GMT
<p> Fri, 21 Sep 2007 22:17:34 GMT (http://www.advogato.org/person/slamb/diary.html?start=64)
<p> <b><a
href="http://www.advogato.org/person/Akira/diary/73.html">akira</a>,
re: deleting code</b>
<p> <blockquote>
One of my most productive days was throwing away 1000 lines
of code.<br>
--<br>
Ken Thompson
</blockquote>
<p> Interesting. One of my most productive days was throwing
away 15000 lines of code.
<p> A consequence of the increased scale of systems? Maybe;
probably also apples and oranges. These 15000 lines of code
were written by a poorly supervised contractor, and Ken
Thompson's 1000 lines were probably his own work.
<p> Fri, 8 Jun 2007 23:49:48 GMT (http://www.advogato.org/person/slamb/diary.html?start=63)
<p> <b>spam flags</b>
<p> I mentioned before that Thunderbird and Mail.app have slightly different flags
for indicating that a message is ham rather than spam. Well, their interaction
seemed to be even weirder than that alone would explain - if a message was
marked as not junk in Mail.app, no attempt to mark it as junk in Thunderbird
would stick. Look for <a
href="http://lxr.mozilla.org/mailnews/search?string=NonJunk">NonJunk
</a> and you'll find this (reformatted to fit your television):
<p> <pre>
PRBool messageClassified = PR_TRUE;
...
if (FindInReadable(NS_LITERAL_CSTRING("NonJunk"), keywords...)
  mDatabase-&gt;SetStringProperty(uidOfMessage, "junkscore", "0");
// Mac Mail uses "NotJunk"
else if (FindInReadable(NS_LITERAL_CSTRING("NotJunk"), keywords...)
  mDatabase-&gt;SetStringProperty(uidOfMessage, "junkscore", "0");
// ### TODO: we really should parse the keywords into
// space delimited keywords before checking
else if (FindInReadable(NS_LITERAL_CSTRING("Junk"), keywords...)
{
  PRUint32 newFlags;
  dbHdr-&gt;AndFlags(~MSG_FLAG_NEW, &amp;newFlags);
  mDatabase-&gt;SetStringProperty(uidOfMessage, "junkscore", "100");
}
else
  messageClassified = PR_FALSE;
</pre>
<p> On startup, Thunderbird says that a message is not junk if Mail.app said it
was NotJunk. When marking a message as Junk, it doesn't clear Mail.app's
NotJunk flags. Brilliant! How could this plan possibly fail?
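<p> The TODO in that snippet points at one way out. Here is a minimal Python sketch of the idea, not Thunderbird's code: split the keyword string into whole keywords before comparing, and clear both ham spellings whenever a message is marked as junk. The names <tt>classify</tt>, <tt>mark_junk</tt> and <tt>HAM_KEYWORDS</tt> are made up for illustration, and the keywords are assumed to arrive as one space-delimited string.

```python
HAM_KEYWORDS = {"NonJunk", "NotJunk"}  # Thunderbird's and Mail.app's spellings


def classify(keywords):
    """Return "ham", "junk", or None for a space-delimited keyword string."""
    # Compare whole keywords, so "NotJunk" can never be mistaken for "Junk".
    tokens = set(keywords.split())
    # Same precedence as the snippet above: a ham keyword wins over Junk.
    if not tokens.isdisjoint(HAM_KEYWORDS):
        return "ham"
    if "Junk" in tokens:
        return "junk"
    return None


def mark_junk(keywords):
    """Add Junk and drop both ham spellings so the new marking can stick."""
    kept = [k for k in keywords.split()
            if k not in HAM_KEYWORDS and k != "Junk"]
    return " ".join(kept + ["Junk"])
```

With the conflicting keyword cleared at marking time, the order of the checks in <tt>classify</tt> stops mattering for messages this client has touched.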
<p> What annoys me is that Thunderbird added this feature after Mail.app but
made a subtle change that broke interoperability. Then they realized their
parsing sucked and they were interpreting Mail.app's NotJunk as saying Junk.
They fixed it with this hack job and the bug popped up elsewhere - now
Thunderbird's attempt to change the marking to junk won't stay across
restarts. A little forethought and there wouldn't have been this mess.
<p> Fri, 8 Jun 2007 04:23:37 GMT (http://www.advogato.org/person/slamb/diary.html?start=62)
<p> <b>Training server-side Bayesian filters</b>
<p> Last night I worked on an unobtrusive way to train SpamAssassin's Bayesian
database. (Autotraining sure spam and ham as it's delivered is nice, but you
at least need a way of correcting its mistakes or it will keep making them.)
The <a
href="http://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html">
sa-learn</a> utility is quite easy to use, but how do you specify what
messages to feed to it? I haven't seen any good glue for this. You want to
feed it messages which have been examined and categorized, and ideally you
want to feed it each message exactly once. (<tt>sa-learn</tt> does realize
that it's seen a message before, but it still takes some processing time to do
even that.)
<p> I decided to harness the power of <a
href="http://www.ietf.org/rfc/rfc2060.txt">RFC 2060</a>. My trainer
connects via IMAP4rev1, executes a <tt>SEARCH</tt> command for
candidates (letting the server do the work of an arbitrarily complex query),
downloads the messages and pipes them through <tt>sa-learn</tt>, flags
them as learned (so the next search will skip them), and disconnects. I
implemented it using <a
href="http://imapfilter.hellug.gr/">imapfilter</a>, and so far it works quite
well. This approach would even work well if the SpamAssassin machine were
separate from the mail store machine.
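<p> The same search, fetch, learn, flag loop can be sketched in Python with <tt>imaplib</tt> and <tt>subprocess</tt>. This is not the imapfilter script described above, only an outline under assumptions: the keyword name <tt>Learned</tt>, the function name <tt>train</tt> and its parameters are invented for the example, and <tt>sa-learn</tt> is assumed to read a single message from standard input.

```python
import subprocess

LEARNED = "Learned"  # hypothetical keyword; any otherwise unused one would do


def train(conn, criteria, kind, run=subprocess.run):
    """Pipe each message matching `criteria` to sa-learn, then flag it.

    `conn` is a logged-in imaplib connection with a mailbox selected,
    `kind` is "spam" or "ham". Returns the number of messages processed.
    """
    # Let the server evaluate the query and hand back matching UIDs.
    _, data = conn.uid("SEARCH", None, criteria)
    uids = [u.decode() for u in data[0].split()]
    for uid in uids:
        # BODY.PEEK[] fetches the whole message without setting \Seen.
        _, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        message = parts[0][1]
        run(["sa-learn", "--" + kind], input=message, check=True)
        # Flag it so the next search skips this message.
        conn.uid("STORE", uid, "+FLAGS", "(%s)" % LEARNED)
    return len(uids)


# Possible use (host and credentials are placeholders):
#   import imaplib
#   conn = imaplib.IMAP4_SSL("mail.example.org")
#   conn.login(user, password)
#   conn.select("Junk")
#   train(conn, "KEYWORD Junk UNKEYWORD Learned", "spam")
#   conn.logout()
```

This version spawns one <tt>sa-learn</tt> per message, which keeps memory use flat but pays the process start-up cost every time.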
<p> In the process, I noticed that Thunderbird updates spam status on the IMAP
server in the <tt>Junk</tt> and <tt>NonJunk</tt> keywords. Mail.app does
the same, in the <tt>Junk</tt> and <tt>NotJunk</tt> keywords (plus a few
others). Did you see it? One uses <tt>No<b>n</b>Junk</tt>, the other
<tt>No<b>t</b>Junk</tt>. How hard would it have been to get these guys
in a room to fight this one out? Grr. They have a weird interaction because
they just didn't put any thought into it.
<p> I also tried out <a href="http://www.lua.org/" >Lua</a> for the first time, as
it's imapfilter's extension language. Turns out I hate it. I really wanted to like
it. I had been thinking of using it all over an embedded product for rapid
development with few resources. It's <a href="http://www.lua.org/manual/5.1/manual.html#8" >minimalist</a>,
fast, and so on. But it's just unpleasant to use. Maybe it's
<i>too</i> minimalist. I would have liked a separate array type (rather than
just "tables" / associative arrays), and I hate "high-level" languages without
exceptions. imapfilter's library is also a bit limiting - its
<tt>fetch_message</tt> and <tt>pipe_to</tt> do everything in memory.
That makes me more irritated that Lua doesn't just have an array slice syntax
I can use to pass message lists to <tt>fetch_message</tt>. And it means I
have to spawn <tt>sa-learn</tt> a bunch of times for reasonable memory
consumption, and starting a Perl process heavy with modules takes a long
time.
<p> I might end up rewriting my trainer in Python using either <a
href="http://docs.python.org/lib/module-imaplib.html">imaplib</a> and
<a href="http://docs.python.org/lib/module-subprocess.html">subprocess</a> or
<a href="http://twistedmatrix.com/documents/current/api/twisted.mail.imap4.html">twisted.mail.imap4</a> and
<a href="http://twistedmatrix.com/documents/current/api/twisted.internet.process.html">twisted.internet.process</a>.
I'm not real impressed with either mail API, though. I like the
<a href="http://java.sun.com/products/javamail/">JavaMail</a> API better, but forking and
interacting with child processes from Java (or even <a
href="http://www.jython.org/">Jython</a>) sounds painful.
<p> Wed, 2 May 2007 22:02:05 GMT (http://www.advogato.org/person/slamb/diary.html?start=61)
<p> <b><a href="http://www.advogato.org/person/clarkbw/diary.html?start=106">clarkbw</a>, re: security choices</b>
<p> <blockquote>C. You are connected to a site pretending to be
www.url.com &hellip;
Something evil could be going on! Someone might be trying to trick you!
Though odds are this isn&rsquo;t true, it&rsquo;s likely that guilt or the legal
department required us to put this dialog up just for this case.</blockquote>
<p> No, no, no, no, no! This text is the <b>entire purpose</b> of SSL.
If it were really unlikely, thousands of people wouldn't have created an entire
ecosystem around validating identities. You have to realize that a private
conversation is totally worthless if you don't know who you are talking to, and
if nothing warns you when that validation fails, why would you have validation
at all? This text wasn't added by lawyers; it was added by people who just
spent man-centuries creating cryptosystems which would be absolutely
worthless if this text were not displayed.
<p> This dialog box shouldn't say "don't worry, this is probably
something wrong with their setup. Just go on, send them your credit card
number like always." That would defeat the purpose of the system so badly
I'm having trouble coming up with an analogy. It's sort of like a policeman
seeing someone trying to pick a lock and <b>opening it for them</b>, then
standing by, smiling, as they walk off with all the valuables the lock was
protecting. If you downplay the security concerns of sending important
information over this link, you're basically telling the lock "sometimes keys
screw up, just let him in." (I warned you the analogy sucked.)
<p> <b>It should be alarming!</b> It needs to be alarming
enough that if someone goes to their bank's website and sees this dialog
box, they won't enter their password. Instead, they'll call their bank on the
telephone and tell them that they've spotted fraud. This is the correct action -
it's either true or it will get the correct people angry at the security people
who screwed up the configuration. It's very rare for a major bank to totally
botch their security setup like this.
<p> On the other hand, it shouldn't be so alarming that it will prevent people from
browsing some random untrusted website which they have no intention of
sending important information to. It's not uncommon for people to require
SSL on a site, not bother paying the money to have it signed by a
widely-trusted CA, and have instructions for people with particularly sensitive
passwords to import the certificate into their browser. That's not a site
configuration problem, either - it's a "you haven't given the computer a way
to verify their identity" problem.
<p> I agree that examining a certificate and finding the problem is unrealistic for
most people. Maybe the details of the certificate should be in an "Advanced"
pull-out or something.
<p> Wed, 2 May 2007 00:07:26 GMT (http://www.advogato.org/person/slamb/diary.html?start=60)
<p> <b><a href="http://www.advogato.org/person/clarkbw/diary.html?start=105">clarkbw</a>, re: security choices</b>
<p> I'm not convinced there's a problem with the status quo. For the 90% of people
you describe, the SSL certificate dialog box comes down to this:
<p> <blockquote>Your connection to <tt>www.bigbank.com</tt> is insecure. It's
likely that people are trying to steal your money.
<p> Give them my money | Cancel</blockquote>
<p> My parents don't understand X.509 PKI, but they do understand that they care
whether a connection is secure if and only if they plan to send financial credentials over
it. They know - and the computer doesn't - what information they are planning
to send. Thus, they are capable of responding to this dialog correctly 100% of
the time. Choosing either option for them would be right less than 100% of the
time. A complicated voting scheme would be right less than 100% of the time.
<p> Tue, 10 Apr 2007 00:01:51 GMT (http://www.advogato.org/person/slamb/diary.html?start=59)
<p> <b><a href="http://www.advogato.org/person/apenwarr/diary.html?start=265">apenwarr</a>, re: tabbed MDI</b>
<p> <blockquote>Tabbed browsing is [...] less flexible [than MDI], because
there's no way to display two documents side-by-side. Imagine if Photoshop used
tabbing between images: useless! (In fairness, the hybrid model used in
Firefox, where you can open a new window or a new tab, is a really good balance. I
just wish there was an easy way to "convert this tab into a window" or vice versa.)
</blockquote>
<p> That's not a fundamental limitation of tabbed MDI. Do you have a Mac? Open
<a href="http://www.adiumx.com/" >Adium</a> with a couple of conversations.
You can drag the tabs - not only to rearrange them within a window, but also
to drag one out of one window into another and back. It's intuitive and even
has nice eye candy. (IIRC gaim has this same feature, though it doesn't look
as nice.) I'd post a trendy screencapture video if I knew how to do such things
easily. Every now and then I try to do it in Safari or Firefox and am
disappointed that it doesn't work.
<p> Sat, 24 Mar 2007 17:43:57 GMT (http://www.advogato.org/person/slamb/diary.html?start=58)
<p> <b><a href="http://www.advogato.org/person/haruspex/diary.html?start=255">haruspex</a></b>
<p> I read the manual. Ironically, you did what you accused me of - not reading fully
before complaining. Read my full post, and you'll see a mention of bison and
(not coincidentally) the same bison pattern rule found in the GNU make manual.
Unfortunately, those tools are not universally available, and this particular
project is developed by a lot of BSD people. I anticipate requiring GNU make and
bison would be a hard sell.
<p> Sat, 24 Mar 2007 10:12:13 GMT (http://www.advogato.org/person/slamb/diary.html?start=57)
<p> <b>make oddities</b>
<p> I'm trying to correctly express the dependencies for running yacc, which
produces multiple targets from a single invocation. Let's start with a rule from
racoon2's <tt>lib/Makefile.in</tt>:
<p> <pre>
.y.c:
	$(YACC) $(YFLAGS) $&lt;
	mv -f y.tab.c cfparse.c
</pre>
<p> There are three problems with this:
<ul>
<li>It has a hardcoded filename in a pattern rule.
<li>It has an intermediate file with a generic filename, which causes
problems when run in parallel. I use <tt>make -j4</tt>, so this gets run in
parallel if there are multiple <tt>.y</tt> files, and in a non-obvious case I'll
mention below.
<li>It doesn't mention the <tt>y.tab.h</tt> target that other files
(<tt>cftoken.o</tt>) depend on.
</ul>
<p> As far as I can see, there's no way to tell make about the generic intermediate
file problem. The best you could do is to use <tt>lockfile(1)</tt>. But it's not
universally available, and if I'm going to try convincing a project to switch to
tools not universally available, I might as well just try for <tt>bison</tt>,
which produces unique filenames directly. Now my rule can look like this:
<p> <pre>
%.tab.c %.tab.h: %.y
	$(BISON) $(YFLAGS) $&lt;
</pre>
<p> This works under GNU make, and <i>almost</i> works under BSD make.
The problem there is that it's run twice. Easy to see with a test file:
<p> <pre>
.PHONY: all
all: foo.out1 foo.out2

%.out1 %.out2: %.in
	lockfile -r0 mylock
	touch $*.out1 $*.out2
	sleep 1
	rm -f mylock

clean:
	rm -f mylock foo.out[12]
</pre>
<p> It works with <tt>gmake</tt> but fails with <tt>bsdmake</tt>. And here's
something odd: replace that pattern rule with a static one...
<p> <pre>
foo.out1 foo.out2: foo.in
	lockfile -r0 mylock
	touch foo.out1 foo.out2
	sleep 1
	rm -f mylock
</pre>
<p> ...and GNU make fails, too. A non-intuitive difference between static and
pattern rules: static rules use multiple targets on a line to say that both
targets are made with similar commands (but different <tt>$@</tt>), while a
pattern one says that both targets are made with the exact same invocation.
<p> What about cheating by making one target depend on another?
<p> <pre>
%.tab.c: %.y
	$(BISON) $(YFLAGS) $&lt;

%.tab.h: %.tab.c
</pre>
<p> I thought about it, but there's no guarantee the targets are produced in a
particular order, and if they are written in the opposite order from the one I
declare, it will rebuild. It might end up doing that over and over again if my
choice is consistently wrong.
<p> I guess what I can do is make <tt>cftoken.o</tt> depend on
<tt>cfparse.tab.c</tt>, as a surrogate for <tt>cfparse.tab.h</tt>. It's silly,
but it works.
<p> My conclusion: make sucks, and GNU make sucks a little less than most. But I
guess I already knew that from
<a href="http://miller.emu.id.au/pmiller/books/rmch/?ref=DDiyet.Com">Recursive Make Considered Harmful</a>.
(Besides making the argument you'd expect from the title, it has some good
points, like how GNU make's reparsing of changed include files puts it a step
above the rest.)
<p> Sun, 11 Mar 2007 07:32:25 GMT (http://www.advogato.org/person/slamb/diary.html?start=56)
<p> <b><a href="http://www.advogato.org/person/lkcl/diary.html?start=366">lkcl</a>, re: message passing</b>
<p> First, you're wrong about <tt>rename()</tt>: it's not atomic. There is an
intermediate step that processes can see. From the Linux manpage:
<p> <blockquote>
However, when overwriting there will probably be a window in which
both
oldpath and newpath refer to the file being renamed.
</blockquote>
<p> (The atomicity people say it provides is carefully limited...<tt>newpath</tt>
either refers to the old file or the new file. That's generally all you need.)
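<p> That limited guarantee is what the usual write-then-rename idiom leans on. A minimal Python sketch, assuming a POSIX system and a temporary file created in the same directory (and so on the same filesystem) as the target; <tt>replace_atomically</tt> is an illustrative name, not something from this thread:

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write `data` (bytes) to a temp file, then rename it over `path`.

    Readers opening `path` get either the complete old contents or the
    complete new contents, never a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        os.replace(tmp, path)     # rename(2) underneath on POSIX
    except BaseException:
        os.unlink(tmp)            # the rename did not happen; clean up
        raise
```

The brief window mentioned in the manpage, where both names refer to the file, does not matter here because no reader ever looks at the temporary name.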
<p> Second, your statement that Linux lacks message passing and atomic
operations is false. Linux has many forms of message
passing between processes/threads - pipes/FIFOs/sockets, POSIX message
queues, etc, and they are entirely suitable for the task
<a href="http://www.advogato.org/person/pphaneuf/diary.html?start=312">pphaneuf asked about</a>.
I'm not sure what atomicity guarantee he'd need that Linux
doesn't provide. You call <tt>write()</tt> to send a byte on a wake-up pipe,
and that byte is either in the buffer or it's not. There's no intermediate state;
therefore, it is atomic. What more do you want? Block until the other
process/thread has actually grabbed the byte? Why? You can build that if desired.
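<p> A wake-up pipe of that kind fits in a few lines. The Python sketch below is only an illustration under assumptions (a POSIX system, both ends non-blocking, and helper names <tt>make_wakeup_pipe</tt>, <tt>wake</tt> and <tt>wait_for_wakeup</tt> invented here); it is not code from either diary entry.

```python
import os
import select


def make_wakeup_pipe():
    """Create a pipe with both ends set non-blocking."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.set_blocking(w, False)
    return r, w


def wake(w):
    """Signal the waiter by writing a single byte."""
    try:
        os.write(w, b"\0")
    except BlockingIOError:
        pass  # pipe is full, so a wake-up is already pending


def wait_for_wakeup(r, timeout):
    """Return True if a wake-up arrived within `timeout` seconds."""
    readable, _, _ = select.select([r], [], [], timeout)
    if not readable:
        return False
    try:
        os.read(r, 4096)  # drain, so several wake-ups collapse into one
    except BlockingIOError:
        pass
    return True
```

The byte is either in the pipe buffer or not, so the waiter sees a wake-up or sees nothing; an acknowledgement from the waiter could be layered on top with a second pipe going the other way.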
<p> Third, your mention of microkernels is a non sequitur. I find Tanenbaum's
work, and the <a href="http://os.inf.tu-dresden.de/L4/" >L4</a> project
you're alluding to by mentioning those universities, to be quite interesting.
However, microkernels are not necessary to provide any particular IPC facility
for userspace processes to communicate amongst themselves.
<p> Fri, 9 Feb 2007 08:11:35 GMT (http://www.advogato.org/person/slamb/diary.html?start=55)
<p> <a href="http://www.advogato.org/person/ncm/diary.html?start=163">ncm</a>, here are five tidbits you probably didn't know about
me:
<p> <ol>
<li>I first learned about the Internet by dialing
<a href="http://www.iscabbs.com/">ISCABBS</a> through my dad's dual-speed 300/1200
baud modem when I was 10.
<p> <li>My official major in college was electrical engineering then computer
science, but my favorite classes and professors were in physics. If I hadn't
already loved computing for nine years before finding the physics for majors
classes, I'd be on a different path today.
<p> <li>When I drove from Iowa City to the Bay Area with a carload of belongings,
I showed up at the doorstep of <a href="http://www.spy.net/~dustin/">good</a>
<a href="http://bleu.west.spy.net/~noelani/">friends</a> I met through ISCABBS. I'd never been to Northern
California or met them in person, and I lived with them for the next year.
<p> <li>I was once nearly thrown in a Tanzanian jail. The Arusha and Serengeti
offices of Tanzanian Immigration and Customs disagree on the legality of my
crossing the border by bicycle from Kenya so far from an official border post.
I probably shouldn't have taken the advice of a Kenyan who'd spent three
days in Egyptian military custody for entering without a visa.[*]
<p> <li>During that crossing, my friends and I went through a dozen inner tubes
and two dozen patches before running out twelve kilometers from our
destination and walking. Three days cycling from Karen, Kenya to Loliondo,
Tanzania - much of it on a rocky, hilly, windy earthen path that diverged
wildly from our map - will do that. We had to bum a 450 km ride to Arusha
where we could get replacements. Probably the stupidest thing I've ever done,
but the trip was one of the greatest experiences of my life. The people were
kind, generous, and amazing in the truest sense of the word.
</ol>
<p> [*] "What did you <i>see</i>?" "Camels and sand. Camels and sand. What do
<i>you</i> see?" But all worked out for him, too - he met his wife in Egypt.