
Archive-name: ai-faq/general/part7
Posting-Frequency: monthly
Last-Modified: Fri Mar 19 13:37:08 PST 1999 by Ric Crabbe
Version: 2.0
Maintainer: Ric Crabbe <cardo@cs.ucla.edu> and Amit Dubey <adubey@netscape.net>
URL: ftp://ftp.cs.ucla.edu/pub/AI/ai_7.faq
Size: 84098 bytes, 1871 lines
Part 7: (FTP Resources):
[7-1] AI Bibliographies available by FTP and WWW
[7-2] Technical Reports available by FTP and WWW
[7-3] Where can I get a machine readable dictionary, thesaurus, and
other text corpora?
[7-4] List of Smalltalk implementations.
[7-5] AI-related CD-ROMs
Subject: [7-1] AI Bibliographies available by FTP
AI:
The Computer Science Department at the University of Saarbruecken, Germany,
maintains a large bibliographic database of articles pertaining to the
field of Artificial Intelligence. Currently the database contains more
than 25,000 references, which can be retrieved by electronic mail from
the LIDO mailserver at lido@cs.uni-sb.de. Send a mail message with
subject line "lidosearch help info" to get instructions on using the
mail server. A variety of queries based on author names, title and
year of publication are possible. The references can be provided in
BibTeX or Refer formats. The entire bibliographic database can be
obtained for a fee by ftp or on tape. Questions may be directed to
bib-1@cs.uni-sb.de.
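A mailserver request of this kind is an ordinary mail message whose
command travels in the Subject: header. The following is a minimal
Python sketch of composing such a request. The function name
build_mailserver_request and the sender address you@example.org are
hypothetical, and actually sending the message (for example with
smtplib) is left out because it depends on your local mail setup.

```python
from email.message import EmailMessage


def build_mailserver_request(server, command, sender):
    """Compose a request whose command is carried in the Subject: header.

    The body is left empty; only the headers are filled in.
    """
    msg = EmailMessage()
    msg["From"] = sender      # hypothetical address, replace with your own
    msg["To"] = server        # mailserver address given in the FAQ text
    msg["Subject"] = command  # the server reads its command from here
    return msg


if __name__ == "__main__":
    request = build_mailserver_request(
        "lido@cs.uni-sb.de", "lidosearch help info", "you@example.org"
    )
    print(request["To"])
    print(request["Subject"])
```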
A variety of AI-related bibliographies are available by anonymous ftp
from
nexus.yorku.ca:/pub/bibliographies/
[Maintainer's note: nexus doesn't seem to be accepting anonymous logins
anymore. Does anyone have a new link?]
Stanford University (SUMEX-AIM) has a large BibTeX bibliography of
Artificial Intelligence papers and technical reports. Available by
anonymous ftp from aim.stanford.edu:/pub/ai{1,2,3}.bib
[Maintainer's note: this one doesn't seem to be working, either]
A large collection of BibTeX bibliographies (290,000+ references) on a
variety of subjects, including artificial intelligence (29,402
entries), neural networks (8,111 entries), and object-oriented
programming (3,493 entries), is available by anonymous ftp from
ftp.ira.uka.de:/pub/bibliography/ [129.13.10.90]
and in the mirror sites
faui80.informatik.uni-erlangen.de:/pub/literatur/Mirror/bibliography/
ftp.cs.umanitoba.ca:/pub/bibliographies/
or by WWW from
ftp://ftp.ira.uka.de/pub/bibliography/index.html
http://www.ira.uka.de/ftp/ira/bibliography/index.html
Some of the bibliographies prohibit commercial use. For more
information, see the README file, or write to Alf-Christian Achilles
<bibservadmin@ira.uka.de> or <achilles@ira.uka.de>.
Glimpse, a searchable interface to the UKA and other
bibliographies, is accessible as
http://glimpse.cs.arizona.edu:1994/bib/
Write to glimpse@cs.arizona.edu for more information.
OFAI Library Bibliography, in Austria
http://www.ai.univie.ac.at/biblio.html
Fuzzy Logic:
A BibTeX database of references addressing neuro-fuzzy issues can be
obtained by anonymous ftp from
ftp.tu-bs.de:/local/papers/ [134.169.34.15]
as the (ascii) file fuzzy-nn.bib.
Genetic Algorithms:
A bibliography of over 400 Evolutionary Computation references (GA,
ES, EP, GP) is available by anonymous ftp from
magenta.me.fau.edu:/pub/ep-list/bib/
[Maintainer's note: this seems to be out-of-date]
The file EC-ref.bib.Z is in BibTeX format; EC-ref.ps.Z is a PostScript
version of the bibliography. Please send additions and corrections to
saravan@amber.me.fau.edu or EP-List@amber.me.fau.edu.
Other Genetic Algorithm bibliography sites include:
ftp://ftp.uwasa.fi/cs/report94-1/
http://www.cogs.susx.ac.uk/users/ezequiel/alife-page/alife.html
Logic Programming, Constraints:
A BibTeX bibliography for Constraint Logic Programming is available
by anonymous ftp from
archive.cis.ohio-state.edu:/pub/clp/
in the bib/ and papers/ subdirectories.
NLP/CL:
For information on a fairly complete bibliography of computational
linguistics and natural language processing work from the 1980s, send
mail to clbib@csli.stanford.edu with the subject HELP.
The CSLI linguistics bibliography contains 3,300 entries in
bib/tib/refer format. The bibliography is heavily slanted towards
phonetics and phonology but also includes a fair amount of
computational morphology, syntax, semantics, and psycholinguistics.
The bibliography can be used with James Alexander's tib
bibliography system, which is available from minos.inria.fr
[128.93.39.5] among other places. The bibliography itself is available
by anonymous ftp from
csli.stanford.edu:/pub/bibliography/
Contributions are welcome, but should be in tib format.
For more information, contact Andras Kornai <kornai@csli.stanford.edu>
NLG:
Robert Dale's Natural Language Generation (NLG) bibliography is
available by anonymous ftp from
scott.cogsci.ed.ac.uk:/pub/nlg/ [129.215.144.3]
Note that it is formatted for A4 paper. To print on 8.5 x 11 inch paper,
insert the line
.94 .94 scale
after the %! line. For further information,
write to Robert Dale, University of Edinburgh, Centre for Cognitive
Science, 2 Buccleuch Place, Edinburgh EH8 9LW Scotland, or
<R.Dale@edinburgh.ac.uk> or <rdale@microsoft.com>.
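The A4-to-letter adjustment for PostScript files can also be scripted.
The following is a minimal Python sketch, assuming the whole file fits
in memory as text and that its first line is the %! header. The
function name add_scale_line is hypothetical.

```python
def add_scale_line(ps_text, scale_line=".94 .94 scale"):
    """Return ps_text with scale_line inserted right after the %! line."""
    # Split off the first line; everything else is kept untouched.
    header, _, rest = ps_text.partition("\n")
    if not header.startswith("%!"):
        raise ValueError("first line is not a PostScript %! header")
    return header + "\n" + scale_line + "\n" + rest


if __name__ == "__main__":
    sample = "%!PS-Adobe-2.0\n%%Title: example\nshowpage\n"
    print(add_scale_line(sample), end="")
```

Passing a different scale_line lets the same helper apply another
factor if .94 does not suit your printer.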
Mark Kantrowitz's Natural Language Generation (NLG) bibliography is
available by anonymous ftp from
ftp.cs.cmu.edu:/user/ai/areas/nlp/nlg/bib/mk/ [128.2.206.173]
In addition to the tech report, the BibTeX file containing the
bibliography is also available. The bibliography contains more than
1,200 entries. A searchable index to the bibliography is
available via the URL
http://liinwww.ira.uka.de/bibliography/Ai/nlg.html
Additions and corrections should be sent to mkant@cs.cmu.edu.
Neural Nets, Learning:
A bibliography of over 1000 entries about Self-Organizing Map
(SOM) and Learning Vector Quantization (LVQ) studies is
available by anonymous ftp from
cochlea.hut.fi:/pub/ref/
as the files references.bib.Z (BibTeX file) and references.ps.Z
(PostScript file). Please send additions and corrections to
biblio@cochlea.hut.fi.
An extensive collection of references on Principal Component Analysis
(PCA) neural networks and learning algorithms is available by
anonymous ftp from dendrite.hut.fi:/pub/ref/ in LaTeX and PostScript
formats. The list was compiled by Liu-Yue Wang, a graduate student of
Erkki Oja, and updated by Juha Karhunen, all from Helsinki University
of Technology, Finland. For more information, contact Erkki Oja
<oja@dendrite.hut.fi>.
A bibliography of PCA algorithms is available by anonymous ftp from
ftp.ai.mit.edu:/pub/sanger-papers/ as pca.bib. For more information,
contact Terry Sanger <tds@ai.mit.edu>.
A 36-page bibliography of connectionist models with symbolic
processing is available by anonymous ftp from Neuroprose
archive.cis.ohio-state.edu:/pub/neuroprose/ [128.146.8.52]
as the file sun.nn-sp-bib.ps.Z. For more information, contact
Ron Sun <rsun@athos.cs.ua.edu>.
Nonmonotonic Logic, Belief Revision:
A bibliography on belief revision and nonmonotonic logics with
about 2,000 items is available by anonymous ftp from
tarski.phil.indiana.edu:/pub/morado/ [129.79.134.34]
as nonmono.bib or nonmono.bib.Z. The file is also available by WAIS as
wais://tarski.phil.indiana.edu/nonmono.bib?
and by gopher/WWW. Please send additions and corrections to Raymundo
Morado <morado@phil.indiana.edu>.
Speech:
A bibliography of papers on Silicon Auditory Models (VLSI
implementations of auditory representations) is available by anonymous
ftp from
hobiecat.pcmp.caltech.edu:/pub/anaprose/lazzaro/sa-biblio.ps.Z
For more information, write to John Lazzaro <lazzaro@boom.cs.berkeley.edu>
----------------------------------------------------------------
Subject: [7-2] Technical Reports available by FTP
This section lists the anonymous ftp sites for technical reports from
several universities and other organizations. Some of the sites
provide only an online catalog of technical reports, while the rest
make the actual reports available online. The email address listed is
that of the appropriate person to contact with questions about
ordering technical reports.
When ftping compressed .Z files, remember to set the transfer type to
binary first, using the command
ftp> binary
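The same binary-mode transfer can be done from a script. The following
is a minimal Python sketch using the standard ftplib module, whose
retrbinary call always transfers in binary (image) mode. The helper
names split_ftp_spec and fetch_binary are hypothetical, and the sketch
assumes the site still accepts anonymous logins, which may no longer be
true for many of the hosts listed in this FAQ.

```python
from ftplib import FTP
from pathlib import PurePosixPath


def split_ftp_spec(spec):
    """Split the FAQ's 'host:/dir/file' notation into (host, dir, file)."""
    host, _, path = spec.partition(":")
    p = PurePosixPath(path)
    return host, str(p.parent), p.name


def fetch_binary(spec, dest=None):
    """Fetch one file by anonymous ftp in binary mode (needs network)."""
    host, directory, filename = split_ftp_spec(spec)
    with FTP(host) as ftp:
        ftp.login()              # no arguments means anonymous login
        ftp.cwd(directory)
        with open(dest or filename, "wb") as out:
            # retrbinary switches the connection to binary (TYPE I)
            ftp.retrbinary("RETR " + filename, out.write)


if __name__ == "__main__":
    print(split_ftp_spec(
        "archive.cis.ohio-state.edu:/pub/neuroprose/sun.nn-sp-bib.ps.Z"
    ))
```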
Other general locations for technical reports from several
universities include:
wuarchive.wustl.edu:/doc/techreports/ [128.252.135.4]
cs-archive.uwaterloo.ca:/cs-archive/ (see Index for an index)
AKA watdragon.uwaterloo.ca [129.97.140.24]
The uwaterloo archive includes tech reports from the Logic Programming
and Artificial Intelligence Group (LPAIG) of the University of Waterloo.
There is also a WAIS server containing tech report abstracts that can be
searched. To use, create the file ~/wais-sources/cs-techreport-abstracts.src
containing
(:source
   :version 3
   :ip-address "130.194.74.201"
   :ip-name "daneel.rdt.monash.edu.au"
   :tcp-port 210
   :database-name "cs-techreport-abstracts"
   :cost 0.00
   :cost-unit :free
   :maintainer "wais@daneel.rdt.monash.edu.au")
and invoke your local wais client. To add to it, email abstracts of
your papers to wais@rdt.monash.edu.au in the following format:
%TI Title
%AU Author (use multiple %AU lines for multiple authors)
%PU Published In (citation information)
%AV Availability (e.g., ftp reports.adm.cs.cmu.edu:/1992/CMU-CS-92-101.ps)
%OR Organization (see cs-techreport-archives.src for institution codes)
%LT Local title (e.g., tech report number)
%DA Date (and, if you want, %MN Month, %YR Year)
%AB Abstract
If your papers are not available by FTP, you can use a %AV line such as:
%AV mail harry.bovik@cs.cmu.edu
Further instructions are available from
daneel.rdt.monash.edu.au:/pub/techreports/reports/README
[Based on a post by Ashwin Ram.]
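An abstract record in the %TI/%AU/.../%AB format can be generated
mechanically. The following is a minimal Python sketch. The function
name format_abstract is hypothetical, the sample title and abstract are
made up for illustration, and treating every field except the title,
authors, and abstract as optional is an assumption; check the README
for the actual requirements.

```python
def format_abstract(title, authors, abstract, published_in=None,
                    availability=None, organization=None,
                    local_title=None, date=None):
    """Render one tech report abstract as tagged %XX lines."""
    lines = ["%TI " + title]
    # One %AU line per author, as the format description asks.
    lines += ["%AU " + author for author in authors]
    optional = (
        ("PU", published_in),
        ("AV", availability),
        ("OR", organization),
        ("LT", local_title),
        ("DA", date),
    )
    for tag, value in optional:
        if value:  # assumption: fields left out are simply omitted
            lines.append("%" + tag + " " + value)
    lines.append("%AB " + abstract)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_abstract(
        title="A Hypothetical Report",
        authors=["Harry Bovik", "A. N. Other"],
        abstract="Placeholder abstract text.",
        availability="mail harry.bovik@cs.cmu.edu",
    ), end="")
```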
Also see the Unified Computer Science Technical Report Index
http://cs.indiana.edu/cstr/search
A list of FTP sites for technical reports and papers can be found in
http://www.rdt.monash.edu.au/tr/siteslist.html
A list of more than 230 sites publishing CS tech reports may be
obtained by anonymous ftp from
ftp.rdt.monash.edu.au:/pub/techreports/sites/sites-list-data
To receive notification of new tech report sites, send mail to
compdoc-techreports-request@ftp.cse.ucsc.edu to join the mailing list.
An archive of linguistics papers and preprints is available from
linguistics.archive.umich.edu:/linguistics/papers/. Contact John Lawler
(jlawler@umich.edu) or linguistics-archivist@umich.edu for more
information.
The Concurrent Engineering Research Center (CERC) at West Virginia
University has placed ASCII versions of the concurrent
engineering-related abstracts (over 500) that were on CERCnet, ASCII
back issues of the Concurrent Engineering Research in Review journal
(now discontinued), and PostScript copies of CERC technical reports on
the gopher server gopher.cerc.wvu.edu. In addition, many of the CERC
technical reports, including journal articles, symposium papers,
theses, dissertations, and issues of the Concurrent Engineering
Research in Review journal, are available in PostScript via
anonymous ftp from
babcock.cerc.wvu.edu:/pub/techReports/ [157.182.44.36]
An index to all the reports, including some that are
available only in hardcopy, is contained in the file "CERC-TR-INDEX".
If you need additional information, contact Mary Carriger, CERC Office
of Information Services, at carriger@cerc.wvu.edu.
The newsgroup comp.doc.techreports is devoted to distributing lists of
tech reports and their abstracts.
MIT Artificial Intelligence Laboratory:
ftp -- publications.ai.mit.edu:/ai-publications/
email -- publications@ai.mit.edu
browse -- telnet reading-room.lcs.mit.edu
www -- www.ai.mit.edu/pubs.html
A full catalog of MIT AI Lab technical reports (and a listing of recent
updates) may be obtained from the above location, by writing to
Publications, Room NE43-818, M.I.T. Artificial Intelligence Laboratory,
545 Technology Square, Cambridge, MA 02139, USA, or by calling
1-617-253-6773. The catalog lists the technical reports ("AI Memos")
with a short abstract and their current prices. There is also a charge
for shipping. Some recent tech reports (since 1991) are available in the
ai-publications/ subdirectory; older technical reports are NOT
available by ftp. A bibliography is in the bibliography/ directory.
CMU School of Computer Science:
ftp -- reports.adm.cs.cmu.edu
email -- Technical.Reports@cs.cmu.edu
www -- reports-archive.adm.cs.cmu.edu/cs.html
CMU Software Engineering Institute:
ftp -- ftp.sei.cmu.edu:/pub/documents
email -- bjz@sei.cmu.edu
www -- www.sei.cmu.edu/publications/publication.html
Yale:
ftp -- dept.cs.yale.edu:/pub/TR/
University of Washington CSE Tech Reports:
ftp -- june.cs.washington.edu:/tr
email -- tr-request@cs.washington.edu
================
AT&T Bell Laboratories:
ftp -- netlib.att.com:/netlib/research/cstr/
bib.Z contains short bibliography, including all the technical
reports contained in this directory.
ftp -- research.att.com:/dist/ai
[Maintainer's note: I assume these have been moved over to Lucent's
domain?]
Argonne National Laboratory:
ftp -- anagram.mcs.anl.gov:/pub/tech_reports
email -- wright@mcs.anl.gov
Contains MCS Division preprints and technical memoranda,
available as either .dvi or .ps files. For descriptions of the
contents, see the subdirectory pub/tech_reports/abstracts; for
the files themselves see the subdirectory pub/tech_reports/reports.
Boston University:
ftp -- cs.bu.edu:/techreports/
email -- techreports@cs.bu.edu
Brown University:
ftp -- wilma.cs.brown.edu:/techreports/
email -- techreports@cs.brown.edu
Cambridge University: Speech, Vision & Robotics Group
ftp -- svr-ftp.eng.cam.ac.uk:/reports/
Columbia University:
ftp -- cs.columbia.edu:/pub/reports
email -- tech-reports@cs.columbia.edu
DEC Cambridge Research Lab:
ftp -- crl.dec.com:/pub/DEC/CRL/abstracts/
crl.dec.com:/pub/DEC/CRL/tech-reports/
DEC Paris Research Lab:
email -- doc-server@prl.dec.com
Put commands in Subject: line of the message.
To get a list of articles, use
send index articles
To get a list of tech reports, use
send index reports
DEC WRL:
email -- wrl-techreports@decwrl.dec.com
To get a helpfile, send a message with
help
in the subject line.
DFKI:
ftp -- duck.dfki.uni-sb.de:/pub/papers
email -- Martin Henz (henz@dfki.uni-sb.de)
Duke University:
ftp -- cs.duke.edu:/dist/papers/
cs.duke.edu:/dist/theses/
email -- techreport@cs.duke.edu [unknown user, 7/7/93]
Edinburgh:
A list of available reports can be sent via email. Send requests
for information about reports from the Centre for Cognitive Science
to cogsci%ed.ac.uk@nsfnet-relay.ac.uk, and from the Human Communication
Research Centre to HCRC%ed.ac.uk@nsfnet-relay.ac.uk.
Electrotechnical Laboratory, Japan:
Reports from the Cooperative Architecture project (half AI, half
software engineering).
ftp -- etlport.etl.go.jp:/pub/kyocho/Papers [192.31.197.99]
See file Index.English.
email -- Hideyuki Nakashima <nakashim@etl.go.jp>.
Georgia Tech College of Computing, AI Group:
ftp -- ftp.cc.gatech.edu:/pub/ai (130.207.3.245)
email -- Professor Ashwin Ram <ashwin@cc.gatech.edu>
HCRC (Human Communication Research Centre):
ftp -- scott.cogsci.ed.ac.uk:/pub/HCRC-papers/
mail -- Fiona-Anne Malcolm
Human Communication Research Centre
2 Buccleuch Place, Edinburgh, UK
Illinois:
email -- Erna Amerman <erna@uiuc.edu>
Illinois Genetic Algorithms Laboratory (IlliGAL):
email -- Eric Thompson <library@gal1.ge.uiuc.edu>
phone -- 217-333-2346 (9AM to 5PM CT, M-F)
mail -- Illinois Genetic Algorithms Laboratory
Department of General Engineering
117 Transportation Building
104 South Mathews Avenue
Urbana, IL 61801-2996
ftp -- gal4.ge.uiuc.edu:/pub/papers/IlliGALs/
Includes the GA bibliography and the Messy GA code in C
(in /pub/src/) and preprints (in /pub/papers/Publications)
www -- http://gal4.ge.uiuc.edu/illigal.home.html
Indiana:
ftp -- cogsci.indiana.edu:/pub [129.79.238.12]
ftp -- ftp.cs.indiana.edu:/pub/techreports [129.79.254.191]
INRIA, France:
ftp -- ftp.inria.fr:/INRIA/publication/
Institute for Learning Sciences at Northwestern University:
ftp -- aristotle.ils.nwu.edu:/pub/papers/
phone -- 708-491-3500
Mechanized Reasoning Group (MRG):
ftp -- ftp.mrg.dist.unige.it:/pub/mrg-ftp
email -- Fausto Giunchiglia <fausto@irst.it>
Mechanized Reasoning Group, IRST
38050 Povo Trento, Italy
Tel: +39 461-314444 (secr.)
+39 461-314436 (office)
Fax: +39 461-302040 / 314591
National University of Singapore:
ftp -- ftp.nus.sg:/pub/NUS/ISCS/techreports
New York University (NYU):
ftp -- cs.nyu.edu:/pub/tech-reports
OGI:
ftp -- cse.ogi.edu:/pub/tech-reports
email -- csedept@cse.ogi.edu
Ohio State University, Laboratory for AI Research
ftp -- nervous.cis.ohio-state.edu:/pub/papers
email -- lair-librarian@cis.ohio-state.edu
OSU Neuroprose:
ftp -- archive.cis.ohio-state.edu:/pub/neuroprose (128.146.8.52)
This directory contains technical reports as a public service to the
connectionist and neural network scientific community which has an
organized mailing list (for info: connectionists-request@cs.cmu.edu)
Includes several bibliographies.
Stanford:
ftp -- elib.stanford.edu:/cs
Very spotty collection.
SRI:
email -- Donna O'Neal, donna@ai.sri.com
SUNY Buffalo:
ftp -- ftp.cs.buffalo.edu:/pub/tech-reports/
SUNY at Stony Brook:
ftp -- sbcs.sunysb.edu:/pub/TechReports
email -- rick@cs.sunysb.edu or stark@cs.sunysb.edu
The /pub/sunysb directory contains the SB-Prolog implementation
of the Prolog language. Contact warren@sbcs.sunysb.edu for more
information.
TCGA (The Clearinghouse for Genetic Algorithms):
email -- Robert Elliott Smith <rob@comec4.mh.ua.edu>
Department of Engineering Mechanics
Room 210 Hardaway Hall
The University of Alabama
PO Box 870278
Tuscaloosa, AL 35487
205-348-1618, fax 205-348-6419
Thinking Machines:
ftp -- ftp.think.com:/think/techreport.list
This file contains a list of Thinking Machines technical reports.
Orders may be placed by email (limit 5) to t-rex@think.com, or by US
Mail to Thinking Machines Corporation, Attn: Technical reports, 245
First Street, Cambridge, MA 02142. In addition, the directories
cm/starlisp and cm/starlogo contain code for the *Lisp and *Logo
simulators.
Tulane University:
ftp -- rex.cs.tulane.edu:/pub/tech/ [129.81.132.1]
University of Alabama:
ftp -- aramis.cs.ua.edu:/pub/tech-reports/
University of Arizona:
ftp -- cs.arizona.edu:/reports/
email -- tr_libr@cs.arizona.edu
The directory /japan/kahaner.reports contains reports on AI in
Japan, among other things, written by Dr. David Kahaner, a
numerical analyst on sabbatical to the Office of Naval
Research-Asia (ONR Asia) in Tokyo from NIST. The reports are not
written in any sort of official capacity, but are quite interesting.
University of California/Los Angeles:
ftp -- ftp.cs.ucla.edu:/tech-report/
University of California/Santa Cruz:
ftp -- ftp.cse.ucsc.edu:/pub/bib/
ftp.cse.ucsc.edu:/pub/tr/
email -- jean@cs.ucsc.edu
University of Cambridge Computer Lab:
email -- tech-reports@cl.cam.ac.uk
University of Colorado:
ftp -- ftp.cs.colorado.edu:/pub/cs/techreports
University of Florida:
ftp -- bikini.cis.ufl.edu:/cis/tech-reports
University of Genoa, Mechanized Reasoning Group:
ftp -- ftp.mrg.dist.unige.it:/pub/mrg-ftp/
email -- Fausto Giunchiglia <fausto@irst.it>
University of Georgia:
ftp -- ai.uga.edu:/pub/ai.reports/
University of Illinois at Urbana:
ftp -- a.cs.uiuc.edu:/pub/dcs
email -- e-amerman@a.cs.uiuc.edu
University of Indiana, Center for Research on Concepts and Cognition:
ftp -- cogsci.indiana.edu:/pub/
email -- helga@cogsci.indiana.edu
University of Kaiserslautern, Germany:
ftp -- ftp.uni-kl.de:/reports_uni-kl/computer_science/
University of Kentucky:
ftp -- ftp.ms.uky.edu:/pub/tech-reports/UK/cs/
University of Massachusetts at Amherst:
email -- techrept@cs.umass.edu
University of Melbourne, Australia,
Computer Vision and Pattern Recognition Laboratory (CVPRL):
ftp -- krang.vis.mu.oz.au:/pub/articles
University of Michigan:
ftp -- ftp.eecs.umich.edu:/techreports
University of North Carolina:
ftp -- ftp.cs.unc.edu:/pub/technical-reports/
University of Pennsylvania:
ftp -- ftp.cis.upenn.edu:/pub/papers/
email -- publications@upenn.edu [email bounced 7/7/93]
USC/Information Sciences Institute:
email -- Sheila Coyazo <scoyazo@isi.edu> is the contact. [email
bounced 7/7/93]
University of Toronto:
ftp -- ftp.cs.toronto.edu:/pub/cogrob/ (Cognitive Robotics)
ftp.cs.toronto.edu:/pub/reports/
email -- tech-reports@cs.toronto.edu
University of Virginia:
ftp -- uvacs.cs.virginia.edu:/pub/techreports/cs
University of Western Australia:
ftp -- ciips.ee.uwa.edu.au
Centre for Intelligent Information Processing Systems (CIIPS)
EE Engineering Department
University of Wisconsin:
ftp -- ftp.cs.wisc.edu:/tech-reports
ftp.cs.wisc.edu:/machine-learning
ftp.cs.wisc.edu:/computer-vision
email -- tech-reports-archive@cs.wisc.edu
Some AI authors have set up repositories of their own papers:
Matthew Ginsberg: t.stanford.edu:/u/ftp/papers
----------------------------------------------------------------
Subject: [7-3] Where can I get a machine readable dictionary, thesaurus, and
other text corpora?
Free:
/usr/dict/words
Roget's 1911 Thesaurus is available by anonymous FTP from the
Consortium for Lexical Research
clr.nmsu.edu:/CLR/lexica/roget-1911 [128.123.1.12]
It is also available from
src.doc.ic.ac.uk:/literary/collections/project_gutenberg/roget11.txt.Z
An old Webster's dictionary is in /text/dict/{DICT.Z,DICT.INDEX.Z}.
Project Gutenberg, which collects public domain electronic books, also
has Roget's 1911 Thesaurus. Its archive is at
mrcnext.cso.uiuc.edu:/pub/etext/. For more
information, write to Michael S. Hart, Professor of Electronic Text,
Executive Director of Project Gutenberg Etext, Illinois Benedictine
College, 5700 College Road, Lisle, IL 60532 or send email to
hart@vmd.cso.uiuc.edu.
For people without FTP, Austin Code Works sells floppy disks
containing Roget's 1911 Thesaurus for $40.00. This money helps support
the production of other useful texts, such as the 1913 Webster's dictionary.
The Online Book Initiative maintains a text repository on
ftp.std.com (a public access UNIX system, 617-739-WRLD). See the
README file on obi.std.com:/obi/. For more information, send email to
obi@world.std.com, write to Software Tool & Die, 1330 Beacon Street,
Brookline, MA 02146, or call 617-739-0202.
The CHILDES project at Carnegie Mellon University has a large amount of
data on children speaking to adults, as well as the adult written and adult
spoken corpora from the CORNELL project. Contact Brian MacWhinney
<brian@andrew.cmu.edu> for more information.
The Association for Computational Linguistics (ACL) has a Data
Collection Initiative. For more information, contact Donald Walker at
Bellcore, walker@flash.bellcore.com.
Two lists of common female first names (4967 names) and male first
names (2924 names) are available by anonymous ftp from
ftp.cs.cmu.edu:/user/ai/areas/nlp/corpora/names/
Read the file README first. Send mail to mkant@cs.cmu.edu for more
information.
A list of 110,000 English words (one per line, in ASCII) is
available in the PD1:<MSDOS.LINGUISTICS> directory on SIMTEL20 as the
files WORDS1.ZIP, WORDS2.ZIP, WORDS3.ZIP, and WORDS4.ZIP. Although the
list is in MS-DOS files, it can easily be used on other machines (but
first you'll have to unzip the files on a DOS machine). The list
includes inflected forms of the words, such as plural nouns and the
-s, -ed, and -ing forms of verbs; thus the number of lexical stems in
the list is considerably smaller than the total number of word forms.
These files are available via FTP from WSMR-SIMTEL20.ARMY.MIL
[192.88.110.20]. SIMTEL20 files are mirrored on wuarchive.wustl.edu.
The Collins English Dictionary encoded as a Prolog fact base is
available from the Oxford Text Archive by anonymous ftp from
ota.ox.ac.uk:/pub/ota/dicts/1192/ [129.67.1.165]
The Oxford Text Archive includes many other texts, dictionaries,
thesauri, word lists, and so on, most of which are available for
scholarly use and research only. See the files
ota.ox.ac.uk:/pub/ota/textarchive.form
ota.ox.ac.uk:/pub/ota/textarchive.info
ota.ox.ac.uk:/pub/ota/textarchive.list
ota.ox.ac.uk:/pub/ota/textarchive.sgml
for more information, or write to archive@ox.ac.uk, Oxford Text Archive,
Oxford University Computing Services, 13 Banbury Road, Oxford OX2
6NN, UK, call 44-865-273238 or fax 44-865-273275.
Chuck Wooters <wooters@icsi.berkeley.edu> has extracted the most
likely pronunciation for each of about 6100 words in the hand-labeled
TIMIT database, and made them available by anonymous ftp from
ftp.icsi.berkeley.edu:/pub/speech/TIMIT.mostlikely.Z.
A list of homophones from general American English is available by
anonymous ftp from svr-ftp.eng.cam.ac.uk:/comp.speech/data/ as the file
homophones-1.01.txt. To receive the list by email, send mail to
Evan.Antworth@sil.org. The list was compiled by Tony Robinson.
Sigurd P. Crossland <sig@seuss.vantage.gte.com> has been compiling
a dictionary of English words, including most common American words,
abbreviations, hyphenations, and even incorrect spellings. The most
recent version is available by anonymous ftp from
wocket.vantage.gte.com:/pub/standard_dictionary/dic-0394.tar.gz
The tar file includes 31 text files, one for each word-length from 2
to 32. The compressed tar file takes up just over 4 MB of space, and
includes approximately 870,000 words.
WordNet is an English lexical reference system based on current
psycholinguistic theories of human lexical memory. It organizes nouns,
verbs and adjectives into synonym sets corresponding to lexical
concepts. The sets are linked by a variety of relations. Besides being
of scientific interest, it makes a handy thesaurus. WordNet is
available by anonymous ftp from
clarity.princeton.edu:/pub/
If you retrieve a copy of WordNet by ftp, please send mail to
wordnet@princeton.edu.
Commercial:
Illumind publishes the Moby Thesaurus (25,000 roots/1.2 million
synonyms), Moby Words (560,000 entries), Moby Hyphenator (155,000
entries), Moby Part-of-Speech (214,000 entries), Moby Pronunciator
(167,000 entries with IPA encoding, syllabification, and primary,
secondary, and tertiary stress marks), and Moby Language (100,000-word
word lists in five major world languages) lexical databases. All
databases are supplied in pure ASCII, royalty-free, in
both Macintosh and MS-DOS disk formats (also in .Z file formats). Both
commercial (to resell derived structures as part of commercial
applications) and educational/research licenses are available. Samples
of each of the lexical databases are available by anonymous ftp from
netcom.com:/pub/grady/Moby_Sampler.tar.Z [192.100.81.100]. For more
information, write to Illumind, Attn: Grady Ward, 3449 Martha Court,
Arcata, CA 95521, call/fax 707-826-7715, or send email to
grady@netcom.com.
[Maintainer's note: This contact information is no longer valid.
We're working on finding a current address.]
The Oxford Text Archive has hundreds of online texts in a wide variety
of languages, including a few dictionaries (the OED, Collins, etc.).
The Lancaster-Oslo-Bergen (LOB), Brown, and London-Lund corpora are also
available from them. For more information, write to Oxford Electronic
Publishing, Oxford University Press, 200 Madison Avenue, New York, NY
10016, call 212-889-0206, or send mail to archive@vax.oxford.ac.uk.
(Their contact information in England is Oxford Text Archive, Oxford
University Computing Services, 13 Banbury Road, Oxford OX2 6NN, UK, +44
(865) 273238.)
Mailing Lists:
CORPORA is a mailing list for Text Corpora. It welcomes information
and questions about text corpora such as availability, aspects of
compiling and using corpora, software, tagging, parsing, and
bibliography. To be added to the list, send a message to
corpora-request@x400.hd.uib.no. Contributions should be sent to
corpora@x400.hd.uib.no.
Linguistic Data Consortium:
The Linguistic Data Consortium was established to broaden the collection
and distribution of speech and natural language data bases for the
purposes of research and technology development in automatic speech
recognition, natural language processing, and other areas where large
amounts of linguistic data are needed. Information about the LDC is
available by anonymous ftp from ftp.cis.upenn.edu:/pub/ldc [130.91.6.8].
Documents available in this directory include a paper on the background,
rationale and goals of the LDC, a brief list of available data bases,
and some tables summarizing these corpora. For further information,
contact Elizabeth Hodas, <ehodas@walnut.ling.upenn.edu>, Mark Liberman
<myl@unagi.cis.upenn.edu>, or Jack Godfrey <jgodfrey@unagi.cis.upenn.edu>.
----------------------------------------------------------------
Subject: [7-4] List of Smalltalk implementations.
Little Smalltalk -- Tim Budd's version of Smalltalk
cs.orst.edu:/pub/budd/small.v3.tar
GNU Smalltalk
prep.ai.mit.edu:/pub/gnu/smalltalk-1.1.1.tar.Z
----------------------------------------------------------------
Subject: [7-5] AI-related CD-ROMs
Prime Time Freeware for AI:
Prime Time Freeware for AI is an annual CD-ROM collection of
Artificial Intelligence freeware source code and documentation. Prime
Time Freeware for AI in no way modifies the legal restrictions on any
package it includes. Each issue consists of two ISO-9660 CD-ROMs,
bound into a 224 page book.
The current issue (1-1; July 1994) includes a selection of the
contents of the CMU AI Repository (see [5-1]), including most of the
AI Programming Languages section and most of the AI Software Packages
section. Thus the CD-ROMs contain nearly every free implementation of
Lisp, Prolog, Scheme, and Smalltalk, including graphical user
interfaces, object-oriented programming extensions, and other software
development tools.
They also contain the most complete collection of free software in
every area of artificial intelligence research and practice, including
Artificial Life, Expert Systems, Fuzzy Logic, Genetic Algorithms,
Knowledge Representation, Machine Learning, Natural Language
Understanding and Generation, Neural Networks, Planning, Reasoning,
Speech Recognition and Synthesis, and Theorem Proving, and much, much more.
All of the more than 1,300 packages are extensively annotated and
indexed, with programs for searching the index included on the CDs.
Because the CD-ROMs use gzip compression, Prime Time Freeware for AI
contains more than 5,000 megabytes of AI-related software.
Prime Time Freeware for AI is targeted at AI researchers, educators,
students, and practitioners. Prime Time Freeware for AI is
particularly useful for programmers who do not have FTP access, but
may also be useful as a way of saving disk space and avoiding annoying
FTP searches and retrievals.
Prime Time Freeware helped establish the CMU AI Repository, and sales
of Prime Time Freeware for AI will continue to help support the
expansion and maintenance of the repository. The list price is $60 US
plus applicable sales tax and shipping and handling charges, payable
by Visa, Mastercard, postal money order in US funds, or check in US
funds drawn on a US bank. Prime Time Freeware for AI offers more than
twice the contents of the NCC AI CD-ROM. For more information write to
Prime Time Freeware
370 Altair Way, Suite 150
Sunnyvale, CA 94086 USA
Tel: 408-433-9662
Fax: 408-433-0727
E-mail: ptf@cfcl.com
NCC AI CD-ROM:
The AI CD-ROM Revision 3 is available from Network Cybernetics Corporation
for $89.00 per copy (plus $3 shipping domestic, $8 shipping international).
The AI CD-ROM is an ISO-9660 format disk usable on any computer system, and
contains a variety of public domain, shareware, and other software of
special interest to the AI community. The disk contains source code,
executable programs, demonstration versions of commercial programs,
tutorials and other files for a variety of operating systems. Among the
supported operating systems are DOS, OS/2, Mac, Amiga, and Unix. Among
the items included are the latest versions of NASA software such as CLIPS v6,
NETS, and SPLICER, the collected source code from AIExpert magazine from
the premier issue in June of 1986 to the present, and complete
transcriptions of the first annual Loebner Prize competition. It also
includes examples of many different kinds of neural networks, genetic
algorithms, artificial life simulators, and natural language software,
as well as public domain and shareware compilers for a wide range of languages
such as Lisp, Xlisp, Scheme, XScheme, Smalltalk, Prolog, ICON, SNOBOL,
and many others. Complete collections of the Neural Digest, Genetic
Algorithms Digest, and Vision List Digest are included. Most files on
the disk are compressed in ZIP format. Macintosh specific files are
in BinHex v4 (.HQX) format. Network Cybernetics Corporation releases annual
revisions to the AI CD-ROM to keep it up to date with current developments
in the field. For more information, write to Network Cybernetics
Corporation, 4201 Wingren Road, Suite 202, Irving, Texas 75062-2763, call
214-650-2002, fax 214-650-1929, or send email to ai-info@ncc.com.
----------------------------------------------------------------