From andy at petdance.com Fri Sep 9 07:44:14 2011
From: andy at petdance.com (Andy Lester)
Date: Fri, 9 Sep 2011 09:44:14 -0500
Subject: [Chicago-talk] Mongo Chicago 2011
Message-ID: <79D22505-FC54-4AB3-B5FB-AA60AD2151C2@petdance.com>
Hello everyone,
Many of you are probably aware of this, but I thought I'd send out a reminder in case you missed it. 10gen's Mongo Chicago 2011 is happening on October 18th. Registration is $50 if you register before September 20th. Proposals for presentations are accepted through September 15th. More details are available on 10gen's site.
Hope to see you there,
Seth
--
This message was sent by Seth Mabbott (seth.mabbott at gmail.com) from Chicago MongoDB User Group.
From andy at petdance.com Fri Sep 9 07:50:52 2011
From: andy at petdance.com (Andy Lester)
Date: Fri, 9 Sep 2011 09:50:52 -0500
Subject: [Chicago-talk] Postgres Open conference in Chicago, September 14-16,
2011
Message-ID: <1EF55713-DAC7-496E-976A-67C70D064F9F@petdance.com>
http://postgresopen.org/2011/home/
Postgres Open features use cases, latest developments in the open
source database PostgreSQL, and a variety of speakers who will talk
about applications, database performance, and the current state of the
database market. Many of the speakers and attendees are Oracle, MS
SQL, Informix and MySQL DBAs who have recently converted to
PostgreSQL.
Our schedule is up at: http://postgresopen.org/2011/schedule/
We're also trying to bring Postgres *to* an existing open source and
database community in Chicago, and connect deeply with folks who
already use Postgres but maybe aren't in touch with key members of the
development community.
Our conference is a non-profit, backed by a 501(c)3, and has a program
committee made up of core PostgreSQL community members, experienced
speakers and myself.
We chose our city based on the number of books related to Postgres
that were sold there. Austin and Chicago are the two places that have
sold the most books, but have never had a Postgres conference located
there.
We'd love to see you at our conference. We're offering a $150 discount
for user groups:
http://postgresopen.org/2011/tickets/
Enter the following discount code: PUGLUV
Feel free to pass the code along to others in the local community.
Thanks so much, and hope to see you there.
-selena
--
Andy Lester => andy at petdance.com => www.petdance.com => AIM:petdance
From sean at blanton.com Fri Sep 9 08:41:21 2011
From: sean at blanton.com (Sean Blanton)
Date: Fri, 9 Sep 2011 11:41:21 -0400
Subject: [Chicago-talk] Mongo Chicago 2011
In-Reply-To: <79D22505-FC54-4AB3-B5FB-AA60AD2151C2@petdance.com>
References: <79D22505-FC54-4AB3-B5FB-AA60AD2151C2@petdance.com>
Message-ID:
Thanks, Seth
Regards,
Sean
From richard at rushlogistics.com Sun Sep 11 13:14:28 2011
From: richard at rushlogistics.com (richard at rushlogistics.com)
Date: Sun, 11 Sep 2011 20:14:28 +0000
Subject: [Chicago-talk] Spliting an up undelimited file
Message-ID: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
I have a text file that I need to split up so I can put it into a database. However, it isn't exactly delimited. The structure is as follows:
March 1, 2006 Few interruptions. Operations proceed as planed.
March 2, 2006 Delays due to bad weather and worker absences.
March 3, 2006 Significant progress. Few absences reported and agreeable weather.
I want to split it into two scalars, date and event description; however, since it's not delimited, I'm not sure how to go about this. Any suggestions appreciated.
Watch our 3 minute movie: http://www.rushlogistics.com/movie
From jim at jimandkoka.com Sun Sep 11 13:19:25 2011
From: jim at jimandkoka.com (Jim Thomason)
Date: Sun, 11 Sep 2011 15:19:25 -0500
Subject: [Chicago-talk] Spliting an up undelimited file
In-Reply-To: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
References: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
Message-ID:
On Sun, Sep 11, 2011 at 3:14 PM, wrote:
> I have a text file that I need to split up so I can put it into a database. However, it isn't exactly delimited. The structure is as follows:
>
> March 1, 2006 Few interruptions. Operations proceed as planed.
> March 2, 2006 Delays due to bad weather and worker absences.
> March 3, 2006 Significant progress. Few absences reported and agreeable weather.
>
> I want to split it up into two scalars: date and event description however since it's not delimited I'm not sure how to go about this. Any suggestions appreciated.
This still looks rigidly structured - "date" "space" "run of text"
    while (<>) {
        if (/(\w+ \d+, \d{4}) (.+)/) {
            my ($date, $memo) = ($1, $2);
            # do something interesting with $date and $memo
        }
    }
or something to that effect. Be more or less paranoid about the format
of the month, date, and year as desired.
-Jim.....
From tigerpeng2001 at yahoo.com Mon Sep 12 07:03:25 2011
From: tigerpeng2001 at yahoo.com (tiger peng)
Date: Mon, 12 Sep 2011 07:03:25 -0700 (PDT)
Subject: [Chicago-talk] Spliting an up undelimited file
In-Reply-To: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
References: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
Message-ID: <1315836205.81348.YahooMailNeo@web120528.mail.ne1.yahoo.com>
Are the date formats known? Is there a limited set of event descriptions? Can you post some (made-up) sample data?
From Andy_Bach at wiwb.uscourts.gov Mon Sep 12 07:20:01 2011
From: Andy_Bach at wiwb.uscourts.gov (Andy_Bach at wiwb.uscourts.gov)
Date: Mon, 12 Sep 2011 09:20:01 -0500
Subject: [Chicago-talk] Spliting an up undelimited file
In-Reply-To:
References: <1002890549-1315772071-cardhu_decombobulator_blackberry.rim.net-930738286-@b17.c5.bise6.blackberry>
Message-ID:
> This still looks rigidly structured - "date" "space" "run of text"
>
> while (<>) {
>     if (/(\w+ \d+, \d{4}) (.+)/) {
>         my ($date, $memo) = ($1, $2);
>         #do something interesting with $date and $memo
>     }
> }
Yeah, and just to be safe, use whitespace metas (and /x for "readability")
to get:
    if (/(\w+ \s+ \d+, \s+ \d+) \s+ (.+)/x) {
if there's a chance for variability, as w/ those logs that outdent the
single-digit date number
    March  8
    March  9
    March 10
and add an 'else' if you want to worry about bad data.
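A minimal sketch pulling the suggestions in this thread together: an anchored /x pattern with \s+ between the fields, plus an else branch for lines that do not fit. The sub name parse_line and the two sample lines are assumptions made for illustration; tighten or loosen the month, day, and year parts to suit the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (date, memo) for a line shaped like
# "Month D, YYYY text", or an empty list when the line does not match.
sub parse_line {
    my ($line) = @_;
    if ($line =~ /^ \s* (\w+ \s+ \d+, \s+ \d{4}) \s+ (.+?) \s* $/x) {
        return ($1, $2);
    }
    return;
}

# Illustrative input only; in practice the lines would come from the file.
for my $line ("March  8, 2006 Delays due to bad weather.", "not a log line") {
    if (my ($date, $memo) = parse_line($line)) {
        print "date=[$date] memo=[$memo]\n";
    }
    else {
        warn "skipping malformed line: $line\n";
    }
}
```

Returning an empty list for a bad line lets the caller decide whether to warn, count, or die, which keeps the "how paranoid to be" choice out of the pattern itself.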
a
----------------------
Andy Bach
Systems Mangler
Internet: andy_bach at wiwb.uscourts.gov
Voice: (608) 261-5738, Cell: (608) 658-1890
"One of the most striking differences between a cat and a lie is that a cat
has only nine lives."
Mark Twain, Vice President, American Anti-Imperialist League, and erstwhile
writer
From vjcang at gmail.com Wed Sep 14 01:50:14 2011
From: vjcang at gmail.com (Vijay Kumar)
Date: Wed, 14 Sep 2011 04:50:14 -0400
Subject: [Chicago-talk] Simulating "Save Link As" in Perl
Message-ID:
Hi,
When I access the $binaryfile_url below (some URL pointing to a binary file)
from a web browser, I get HTTP Error 404.
However, I can save the binary file to my hard disk by right-clicking the
URL and selecting 'Save Link As'.
Now, when I try this with LWP::Simple
    my $status=getstore($binaryfile_url, $download_file_fullpath);
it fails with the same 404 error.
I want to simulate the 'Save Link As' behavior of the web browser to
download it programmatically from Perl. Any ideas?
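For reference, a minimal sketch of the getstore call with its return value checked, so that a failed download is reported rather than passing silently. getstore and is_success are exported by LWP::Simple; the wrapper name save_link_as and the placeholder URL and path in the usage comment are assumptions for illustration.

```perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical wrapper: fetch $url into $path and die with the HTTP
# status code if the transfer did not succeed.
sub save_link_as {
    my ($url, $path) = @_;
    my $status = getstore($url, $path);    # returns the HTTP status code
    die "download of $url failed with HTTP status $status\n"
        unless is_success($status);
    return $status;
}

# Usage (URL and file name are placeholders):
# save_link_as('http://example.com/file.bin', '/tmp/file.bin');
```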
Thanks a lot
VIJAY
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From vjcang at gmail.com Wed Sep 14 02:14:26 2011
From: vjcang at gmail.com (Vijay Kumar)
Date: Wed, 14 Sep 2011 05:14:26 -0400
Subject: [Chicago-talk] Simulating "Save Link As" in Perl
In-Reply-To:
References:
Message-ID:
Apologies. Please ignore this mail.
I made a mistake in my testing. It works.
Thanks
VIJAY
From michael at potter.name Thu Sep 15 08:18:16 2011
From: michael at potter.name (Michael Potter)
Date: Thu, 15 Sep 2011 11:18:16 -0400
Subject: [Chicago-talk] Mechanical Turk
Message-ID:
Perl Crew,
I have been called upon to try to do "OCR" on handwriting.
In particular, I need to convert a handwritten name to ASCII. I could
provide a small .tif with just the name in it.
It came to mind that this might be a good use of Mechanical Turk.
I am sending this to the Perl list because I seem to recall some of the
Mongers have worked with Mechanical Turk.
Here are my specific questions:
1) How long is the typical turnaround for a response?
2) Is this a reasonable task for Mechanical Turk?
I looked at the Amazon website for HITs similar to what I am trying to do.
I did not find any, but I question my ability to search completely. The
closest I found was business card transcription.
Your comments are welcome.
--
Michael Potter
Replatform Technologies, LLC
+1 770 815 6142
michael at potter.name
From joel.a.berger at gmail.com Thu Sep 15 12:26:13 2011
From: joel.a.berger at gmail.com (Joel Berger)
Date: Thu, 15 Sep 2011 14:26:13 -0500
Subject: [Chicago-talk] Mechanical Turk
In-Reply-To:
References:
Message-ID:
Have you tried OCRing programmatically?
http://search.cpan.org/search?mode=all&query=ocr
How have the results been? It seems that if you could eliminate the
easy ones and perhaps only shift the problematic ones to mTurk that
would be cheaper.
Joel
From michael at potter.name Thu Sep 15 13:01:16 2011
From: michael at potter.name (Michael Potter)
Date: Thu, 15 Sep 2011 16:01:16 -0400
Subject: [Chicago-talk] Mechanical Turk
In-Reply-To:
References:
Message-ID:
Yes, we are using tesseract-3.00 for OCR of the computer-printed text.
We are going to try to get Tesseract trained to do handwritten block
letters, but I am not holding out a lot of hope that it will work.
I am researching the next best option, which might be Mechanical Turk.
--
Michael Potter
Replatform Technologies, LLC
+1 770 815 6142
michael at potter.name
From michael at potter.name Thu Sep 15 14:39:19 2011
From: michael at potter.name (Michael Potter)
Date: Thu, 15 Sep 2011 17:39:19 -0400
Subject: [Chicago-talk] Mechanical Turk
In-Reply-To:
References:
Message-ID:
Here are a couple more comments:
Errors are not a big deal.
We already deal with typos in names all the time.
To check, I think I would run each name twice and, if the two results did
not match closely, run it a third time.
The names are not sensitive. The stranger would know that somewhere in the
world a person lived named "Ruth Smith". Not a big deal. If at some time
in the future someone decides that it is a big deal, I will run a HIT for
the first name and a HIT for the last name.
Anyone know the trick to embedding the image in the HIT?
From what I read, I need to provide a URL to the image, but I would rather
have the image embedded in the request. Seems easier to control security.
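The run-twice-then-a-third-time check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: $ask is a hypothetical callback that submits one HIT for an image and returns the transcribed text (no real Mechanical Turk API call is shown), and "matching" is taken to mean equality after folding case and whitespace, which is a simplification.

```perl
use strict;
use warnings;

# Fold case and whitespace so trivial differences do not count as a mismatch.
sub normalise {
    my ($name) = @_;
    $name = lc $name;
    $name =~ s/^\s+|\s+$//g;
    $name =~ s/\s+/ /g;
    return $name;
}

# $ask is a hypothetical callback: it takes an image and returns one
# worker's transcription. Returns the agreed name, or undef if no two agree.
sub transcribe_with_check {
    my ($ask, $image) = @_;
    my @answers = map { normalise($ask->($image)) } 1 .. 2;
    return $answers[0] if $answers[0] eq $answers[1];

    # The first two disagree: ask a third time and accept any majority.
    my $third = normalise($ask->($image));
    for my $earlier (@answers) {
        return $third if $third eq $earlier;
    }
    return;    # no two answers agree; flag the image for manual review
}
```

A fuzzier comparison (edit distance, for example) could replace the string equality if near-misses should count as agreement.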
--
Michael Potter
Replatform Technologies, LLC
+1 770 815 6142
michael at potter.name
From selenamarie at gmail.com Sun Sep 18 20:03:15 2011
From: selenamarie at gmail.com (Selena Deckelmann)
Date: Sun, 18 Sep 2011 22:03:15 -0500
Subject: [Chicago-talk] Slides from PostgreSQL 9.1 talk
In-Reply-To:
References:
Message-ID:
Hello Perlmongers!
Thanks for hosting me at BofA in Chicago last week. Stephen Frost,
David Wheeler and I had a blast.
Here's a shortlink: http://chesnok.com/u/4U
I made one mistake in the discussion that I can correct now - unlogged
tables *are* preserved after a clean shutdown as of the 9.1 release.
Stephen and I discussed the issue with the author of the feature
during Postgres Open, and he let us know that a long discussion
happened about what the preferred default behavior should be, and
those who thought clean shutdown *should not cause a truncate*
prevailed.
The slides are updated to reflect that.
Thanks again!
-selena
--
http://chesnok.com
From richard at rushlogistics.com Wed Sep 21 17:28:59 2011
From: richard at rushlogistics.com (Richard Reina)
Date: Wed, 21 Sep 2011 20:28:59 -0400 (EDT)
Subject: [Chicago-talk] data mining
Message-ID: <20110922002859.A20EA611@swiftsure.xo.com>
I am hoping to create a US geography database of information about US towns and cities. I was able to load the populations of all incorporated towns from a US census file into a table. However, I would also like a table of facts (or trivia) about the towns themselves, large and small: when they were founded and what they're known for (if anything). Writing a Perl program that uses regexes to search the web is probably beyond my skills. However, I am wondering whether such a thing is possible and, if so, how hard it is. I looked on CPAN, but I confess I am not sure I would recognise what I need if I saw it. Any ideas on how, or IF, this can be done would be greatly appreciated.
Thanks,
Richard
--
Richard Reina
Rush Logistics, Inc.
Watch our 3 minute movie:
http://www.rushlogistics.com/movie
From don at drakeconsulting.com Wed Sep 21 17:49:51 2011
From: don at drakeconsulting.com (Don Drake)
Date: Wed, 21 Sep 2011 19:49:51 -0500
Subject: [Chicago-talk] data mining
In-Reply-To: <20110922002859.A20EA611@swiftsure.xo.com>
References: <20110922002859.A20EA611@swiftsure.xo.com>
Message-ID: <97F7107D-B647-409A-9A22-381659F97813@drakeconsulting.com>
I would look here for data:
https://simplegeo.com/products/context/#11.00/41.8639/-87.6091
or
http://www.factual.com
and use their APIs to get the data you need.
-Don
--
Don Drake
www.drakeconsulting.com
www.maillaunder.com
312-560-1574
800-733-2143
From dcmertens.perl at gmail.com Thu Sep 22 06:40:30 2011
From: dcmertens.perl at gmail.com (David Mertens)
Date: Thu, 22 Sep 2011 08:40:30 -0500
Subject: [Chicago-talk] data mining
In-Reply-To: <20110922002859.A20EA611@swiftsure.xo.com>
References: <20110922002859.A20EA611@swiftsure.xo.com>
Message-ID:
Richard, you said:
> Writing a perl program that uses regex to search the web is probably beyond my skills.
Can you elaborate on that? To what would you apply such a regex? Were
you thinking about doing an all-out web crawl, then parsing the output
to find relevant information about a given city? Are you looking for
an existing database that you can tap? (If the latter, Don's
suggestions look pretty good. See p3rl.org/Geo::Coder::SimpleGeo or
p3rl.org/Net::HTTP::Factual.)
If you want an all-out web crawl, generating your data from the web
from scratch, I can imagine putting something together with
WWW::Mechanize to do the crawl. Determining which page has
relevant---and authoritative---information about a city, and then
managing to extract that information, can get very complicated. What
are your resources? What is your timeline? What is your expertise in
data mining?
David