e2interface-perl/0.34

...because you know you want to script e2.

i n t r o

e2interface is a set of Perl modules that make it simple, often trivial, to post data to and retrieve data from everything2.com. If you are writing an e2 client of any sort, if you wish to create a proxy between e2 and any obscure protocol or encumbrance, if you wish to include e2 data in a dynamic website, if you want to run perl -e one-liners from the command line that fetch e2 data and funnel it into one program or another, e2interface might simplify your task profoundly. If you enjoy wrestling with XML parsers and URL encoding and quirks in the system, e2interface will probably ruin all your fun.

Sorry.

e2interface is object-oriented, portable, simple to use, and in the public domain. And, since code audits are going to be necessary for any widely-used e2 client, it provides a single, centralized point that handles your sensitive data. With no user-interface or other extraneous code, it can be easily audited, and because it is written in an interpreted rather than a compiled language, you can be confident that it is not host to some clever compiled-in trojan.

Okay, kids, that's the pitch, the fast sell, and whether you bought it or not, I think it's time (this being Perl, after all) to get our hands right down in the grease. Well, first I'm gonna go into your standard disclaimer section, which you can probably skip right over.

t h e   f i n e   p r i n t

Requirements:

expat (http://www.jclark.com/xml/expat.html). expat is packaged with ActiveState's Windows perl distribution; for others, expat can either be built from source or (in most cases) installed from binary packages. On my debian system, installation is a simple apt-get install expat expat1-dev.

E2::Interface

And that, folks, will either print Hello, $user, you are currently logged in. or Hello, Guest User, you are currently NOT logged in., based on whether or not $user and $pass were accepted by the e2 server. Of course we could have checked the return value of login to test for the same thing if we really were into the whole brevity thing.

But anyway, this is E2::Interface, which is the base class to this whole set of modules. It'll log you in, log you out, handle all the funky HTTP/cookie/useragent complexities, manage background threads, clone ... well, we'll get to that heavy stuff later. For now we'll move on to things that are a little more immediately useful.

....like E2::Node and E2::Ticker, which represent the two branches of e2interface. That is, those modules that load data from the nodegel proper (E2::Node and its descendants), and those modules that load data from special XML Tickers (E2::Ticker and its descendants). All the really useful classes fall into one of these two categories.

(title, author, and createtime correspond to the title, author (creator) and creation time of the node in question. That is, the whole node, as opposed to, say, particular writeups in that node.)

Now, E2::Node deals with "nodes" in the generic sense of the word. That means it deals with e2nodes (those nodes that can contain writeups), writeups, superdocs, users, usergroups, rooms, documents, whatever. Don't confuse E2::Node with E2::E2Node, which deals, in particular, with e2nodes and writeups.

What E2::Node is all about is loading nodes, which it does with the following methods: load, load_by_id, and load_from_xml. The first two fetch a page based on either title or node_id (and, optionally, type—see the documentation for E2::Node below), and then all three parse the document they've received and populate a bunch of internal variables, which we can then retrieve via the access methods.

So is E2::Node just one of those generic superclasses that you never instantiate on its own? Well, not really. I mean it is a generic superclass, but it's also smart enough to determine, while loading a node, just what type of node it's loading. Say, in the above example, $node is aware that it is, itself, an object of the E2::Node class rather than one of its descendants. While loading "Butterfinger McFlurry," it discovers that the node it's loading is an e2node. It then re-blesses itself into the appropriate subclass (in this case, E2::E2Node). This class-change only takes place on instances of E2::Node (if one of its subclasses tries to load a node of an incompatible type, it throws an exception—see below for an explanation of e2interface's exception-handling mechanism).
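
The re-blessing described above can be illustrated without e2interface itself. What follows is a minimal sketch, not the library's actual code: the package names My::Node and My::E2Node, the body of load, and the hard-coded type table are all stand-ins invented for illustration, and the sketch uses a plain die where the library is described as using Carp::croak.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for E2::Node and E2::E2Node; not the real classes.
package My::Node;

# Maps a node type to the subclass that handles it (illustrative only).
my %class_for = ( e2node => 'My::E2Node' );

sub new { my $class = shift; return bless {}, $class }

sub load {
    my ( $self, $title, $type ) = @_;
    $self->{title} = $title;
    $self->{type}  = $type;    # passed in here; a real loader would discover it

    my $target = $class_for{$type};
    if ( ref($self) eq 'My::Node' ) {
        # Only an instance of the generic base class changes class.
        bless $self, $target if $target;
    }
    elsif ( !$target || !$self->isa($target) ) {
        # A subclass refuses a node of an incompatible type.
        die "Wrong node type: $type\n";
    }
    return 1;
}

package My::E2Node;
our @ISA = ('My::Node');

package main;

my $node = My::Node->new;
$node->load( 'Butterfinger McFlurry', 'e2node' );
print ref($node), "\n";    # prints My::E2Node
```

The same object reference is still valid after the load; only the class it reports has changed, so methods defined in the subclass become callable on it.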

Besides allowing access to information stored in nodes, certain specific nodetypes have particular actions associated with them (example: e2nodes can be created and can have writeups added to them). These are implemented in the appropriate subclass of E2::Node.

E2::Ticker is basically a collection of subroutines that fetch tickers, parse them, and return lists of hashrefs that represent the values listed in those tickers. Occasionally, tickers pass values not associated with any member of the list (example: the Time Since ticker returns the current time, which, after a call to time_since to fetch a list of values, can be retrieved with a call to time_since_now). Those tickers that require a more complex interface are implemented as subclasses to E2::Ticker.

Exceptions

Error-handling in e2interface is implemented using Perl's exception mechanism, eval { } and Carp::croak. For anything more serious than a boolean "this method was unsuccessful" return, an exception will be thrown. The main exceptions are the following:

'Usage:'

Usage error: a method was called with improper parameters (i.e. a bug in the caller's code). A description of the proper usage of that method follows the colon.

'Unable to process request'

HTTP communication error.

'Invalid document'

Invalid document received.

'Parse error:'

Exception raised by XML parser. The error XML::Twig generated will follow after the colon.

'Wrong node type:'

Attempted to load an incompatible node type (trying to load an e2node with E2::User, for example, would generate one of these). The (invalid) type is listed after the colon. NOTE: This exception will only be thrown by E2::Node and its descendants.

Each exception is returned as a string. They can be tested for with regexps, for example:

/^Parse error: (.*)/

The exceptions that have names ending in colons have more specific data following that colon. Also, all exceptions are generated by calls to Carp::croak( "..." ), so they contain line number information at the end of the string. In other words, when testing for exceptions, always test the beginning of the string.
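
A minimal sketch of this kind of test, runnable without e2interface installed. The package My::Lib and its risky_call subroutine are hypothetical stand-ins that croak the way the library is described as doing, and classify is a helper named here only for illustration.

```perl
use strict;
use warnings;

package My::Lib;    # hypothetical stand-in for an e2interface class
use Carp;
sub risky_call { croak "Parse error: mismatched tag" }

package main;

# Classify an exception string by its prefix. Every pattern is anchored
# at the start, because croak appends " at FILE line N." to the message.
sub classify {
    my ($err) = @_;
    return ( 'usage',   $1 ) if $err =~ /^Usage:\s*(.*)/;
    return ( 'parse',   $1 ) if $err =~ /^Parse error:\s*(.*)/;
    return ( 'type',    $1 ) if $err =~ /^Wrong node type:\s*(.*)/;
    return ( 'http',    '' ) if $err =~ /^Unable to process request/;
    return ( 'invalid', '' ) if $err =~ /^Invalid document/;
    return ( 'unknown', $err );
}

my ( $kind, $detail );
unless ( eval { My::Lib::risky_call(); 1 } ) {
    ( $kind, $detail ) = classify($@);
    print "caught a $kind exception: $detail\n";
}
```

The "eval { ...; 1 }" idiom is used so that failure is detected from eval's return value rather than from the contents of $@ alone.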

Threading

Network access is slow. Methods that rely upon network access may hold control of your program for a number of seconds, perhaps even minutes. In an interactive program, this sort of wait may be unacceptable.

(If this section seems confusing, you can safely ignore it. It was added to the library to address a specific shortcoming in perl (the kludge required to share objects across threads), and if you don't need threading, you really don't need to know any more from this section.)

e2interface supports a limited form of multithreading (in versions of perl that support ithreads--i.e. 5.8.0 and later) that allows network-dependent methods to be called in the background and their return values to be retrieved later on. This is enabled by calling use_threads on an instance of any class derived from E2::Interface (threading is cloned, so use_threads affects all instances of e2interface classes that have been cloned from one another). After enabling threading, any method that relies on network access will return (-1, job_id) and be executed in the background.

This job_id can then be passed to finish to retrieve the return value of the method. If, in the call to finish, the method has not yet completed, it returns (-1, job_id). If the method has completed, finish returns a list consisting of the job_id followed by the return value of the method.
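
The polling convention just described can be wrapped in a small helper. This is only a sketch under stated assumptions: My::FakeJob is a hypothetical stand-in that imitates the documented return values of finish so the code runs without a network connection, wait_for_job is a name invented here, and job ids are assumed never to be -1.

```perl
use strict;
use warnings;

# Hypothetical stand-in mimicking the finish() convention described
# above: the job is "pending" for two polls and then completes.
package My::FakeJob;
sub new { return bless { polls => 0 }, shift }

sub finish {
    my ( $self, $job_id ) = @_;
    return ( -1, $job_id ) if ++$self->{polls} < 3;
    return ( $job_id, 'first message', 'second message' );
}

package main;

# Poll finish() until the job completes, then drop the leading job_id
# and return only the deferred method's own return value.
sub wait_for_job {
    my ( $obj, $job_id, $delay ) = @_;
    while (1) {
        my @ret = $obj->finish($job_id);
        if ( @ret == 2 && $ret[0] eq '-1' && $ret[1] eq $job_id ) {
            # Still pending; an interactive program would do other work here.
            select( undef, undef, undef, $delay ) if $delay;
            next;
        }
        shift @ret;    # discard the job_id
        return @ret;
    }
}

my @messages = wait_for_job( My::FakeJob->new, 7, 0 );
print "$_\n" for @messages;
```

With the real library, the job_id would come from the second element of the (-1, job_id) list returned by the deferred method, and the object passed in would be the e2interface instance on which use_threads was called.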

A code reference can also be attached to a background method, using the thread_then method.

    use E2::Message;

    my $catbox = E2::Message->new;
    $catbox->use_threads;

    # Execute $catbox->list_public in the background
    $catbox->thread_then( [ \&E2::Message::list_public, $catbox ],

        # This subroutine will be called when list_public
        # finishes, and will be passed its return value in @_
        sub {
            foreach( @_ ) {
                print $_->{text};
            }

            # If we were to return something here, it could
            # be retrieved in the call to finish() below.
        }
    );

    # Do stuff here.....

    # Discard the return value of the deferred method (this will be
    # the point where the above anonymous subroutine actually
    # gets executed, during a call to finish())
    while( $catbox->finish ) {}    # finish will not return a false
                                   # value until all deferred methods
                                   # have completed

t h e   m o d u l e s

This document, this writeup, is intended as a supplement to the more extensive documentation in the modules themselves, which should be accessible in man pages titled according to the package names (example: man E2::Interface). Their HTML equivalents are available at http://joseweeks.com/e2interface/.

What follows is a list of all the modules in e2interface and all of their public methods, and then a couple example scripts to show the package in action.

Method list

E2::Interface:

login USERNAME, PASSWORD

attempts to log in to Everything2.com with the specified USERNAME and PASSWORD.

verify_login

can be called after setting cookie to verify the login.

logout

attempts to log the user out of Everything2.com.

process_request HASH

requests the specified page via HTTP and returns its text.

clone OBJECT

copies various members from the E2::Interface-derived object OBJECT to this object so that both objects will use the same agent to process requests to Everything2.com.

debug [ LEVEL ]

sets the debug level of e2interface.

client_name

returns the name of this client, "e2interface-perl".

version

returns the version number of this client.

this_username

returns the username currently being used by this agent.

this_user_id

returns the user_id of the current user.

domain [ DOMAIN ]

returns, and (if DOMAIN is specified) sets the domain used to fetch pages from e2.

cookie [ COOKIE ]

returns, and (if COOKIE is specified) sets, the current everything2.com cookie (used to maintain login).

agentstring [ AGENTSTRING ]

returns and optionally sets the value prepended to e2interface's agentstring, which is then used in HTTP requests.

document

returns the text of the last document retrieved by this instance in a call to process_request.

logged_in

returns a boolean value, true if the user is logged in and undef if not.

use_threads [ NUMBER ]

creates a background thread (or NUMBER background threads) to be used to execute network-dependent methods.

join_threads

detach_threads

These methods disable e2interface's threading for an instance or a set of cloned instances.

finish [ JOB_ID ]

handles all post-processing of deferred methods, and returns the final return value of the deferred method.

thread_then METHOD, CODE [, FINAL ]

executes METHOD (which is a reference to an array that consists of a method and its parameters, e.g.: [ \&E2::Node::load, $e2, $title, $type ]), and sets up CODE (a code reference) to be passed the return value of METHOD when METHOD completes.

E2::Node:

title

node_id

author

author_id

createtime

type

These methods return, respectively, the title of the node, the node_id, the author, the user_id of the author, the createtime (in the format "YYYY-MM-DD HH:MM:SS"), or the type, of the current node. They return undef if there is no node currently loaded.
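
A string in the createtime format can be turned into a date object with the core Time::Piece module. This is a minimal sketch: parse_createtime is a helper name invented here, the sample value is made up for illustration, and because the format carries no timezone the result is treated as UTC, which may not match the server's clock.

```perl
use strict;
use warnings;
use Time::Piece;

# Parse a "YYYY-MM-DD HH:MM:SS" string into a Time::Piece object.
# The string carries no timezone; strptime treats it as UTC.
sub parse_createtime {
    my ($string) = @_;
    return Time::Piece->strptime( $string, '%Y-%m-%d %H:%M:%S' );
}

# Sample value invented for illustration, not a real node's createtime.
my $created = parse_createtime('2003-03-16 15:58:20');
print $created->year, ' ', $created->monname, ' ', $created->mday, "\n";
```

Once parsed, two such objects can be compared or subtracted directly, which is handy for sorting nodes by age.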

exists

Boolean: "Does this node exist?"

load TITLE [, TYPE ] [, SOFTLINK ]

load_by_id NODE_ID [, SOFTLINK ]

load_from_xml XML_STRING

These methods load a node based upon, respectively, TITLE, NODE_ID, or XML_STRING. They populate a number of internal variables, which are accessible through the access methods listed above.

autodetect

is used to enable nodetype autodetection on an object that would normally not allow it.

E2::E2Node:

clear

clears all the information currently stored in $node.

has_mine

is_locked

Boolean: "Does this node have a writeup by me in it?"; "Is this node softlocked?"

list_softlinks

list_firmlinks

list_sametitles

These methods return a list of softlinks, firmlinks, or sametitles.

list_writeups

returns a list of E2::Writeups corresponding to the writeups in the currently-loaded node. It returns an empty list if this node contains no writeups, and undef if there is no node currently loaded.

get_writeup [ NUM ]

get_writeup_by_author AUTHOR

get_my_writeup

These methods return references to E2::Writeup objects. get_writeup returns the NUM'th writeup in the current node (or, if NUM is not specified, the writeup immediately succeeding the last writeup returned by get_writeup). get_writeup_by_author returns the writeup in the current node that was written by AUTHOR. get_my_writeup returns the writeup in the current node written by the currently-logged-in user. See the E2::Writeup manpage for information about accessing writeup data.

get_writeup_count

returns the number of writeups in the current node. Returns undef if there is no node currently loaded.

get_writeup_number

returns the number of the next writeup that get_writeup will, by default, return. Returns undef if there is no node currently loaded.

vote NODE_ID => VOTE [ , NODE_ID2 => VOTE2 [ , ... ] ]

votes on a list of writeups. There should be a NODE_ID => VOTE pair for each writeup to vote upon. NODE_ID is the node_id of the writeup, and VOTE is either -1 or 1 (downvote or upvote, respectively).

add_writeup TEXT, TYPE [ , NODISPLAY ]

adds a new writeup to the current node. TEXT is the text of the writeup, TYPE is the type of writeup it is (one of: "person", "place", "thing", or "idea"), and NODISPLAY, if true (it defaults to false), tells E2 not to display this writeup in "New Writeups". It returns true on success and undef on failure.

create TITLE

creates a new node (a "nodeshell") of title TITLE, then loads this new node.

E2::Writeup:

clear

clears all the information currently stored in $writeup. It returns true.

wrtype

parent

parent_id

marked

cool_count

text

These methods return, respectively, the writeup's type, its parent's title, its parent's node_id, its "marked for destruction" status (boolean: is it marked for destruction?), the number of C!s it has received, and the text of the writeup.

cools

returns a list of the users who've cooled this writeup.

rep

returns a hashref concerning the reputation of this writeup.

cool [ NODE_ID ]

attempts to cool (C!) a writeup. If NODE_ID is specified, it attempts to cool the writeup with that id, otherwise it attempts to cool the currently-loaded writeup.

vote -1 | 1

attempts to vote on this writeup (-1 for a downvote, 1 for an upvote).

reply TEXT [, CC ]

sends a "blab" message reply to the author of the currently-loaded writeup. If CC is true, it sends a copy of the message to you, the sender.

update TEXT [ , TYPE ]

updates the currently-loaded writeup. TYPE, which defaults to the type the writeup was prior to the update, is the type of writeup this is (one of: "person", "place", "thing", or "idea"). During the update, the writeup is re-loaded, so any changes should be immediately visible in this object.

E2::User:

name

id

lasttime

These return, respectively, the username, user_id, the time of account creation, and the last time the specified user was seen on E2.

alias

alias_id

These return, respectively, the username and user_id of this user's alias (for message forwarding) if he indeed has one.

text

returns the text of the "User Bio" section of the user's homenode.

experience

level

level_string

These return, respectively, the XP number of the user in question, the level number, and the level including description text ("13 (Pseudo God)", etc.).
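
If only a level string is at hand, the number and the description can be separated with a regular expression. A small sketch, assuming the string always has the shape shown in the example above; split_level_string is a name invented here.

```perl
use strict;
use warnings;

# Split a level string such as "13 (Pseudo God)" into number and title.
# Returns an empty list if the string does not have that shape.
sub split_level_string {
    my ($string) = @_;
    return unless defined $string;
    return unless $string =~ /^\s*(\d+)\s*\((.*)\)\s*$/;
    return ( $1, $2 );
}

my ( $level, $title ) = split_level_string('13 (Pseudo God)');
print "level $level, title $title\n";    # level 13, title Pseudo God
```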

writeup_count

cool_count

These return, respectively, the number of writeups written by the user in question, and the number of C!s he has spent.

image_url

returns a relative URL to the homenode image of the user in question.

lastnode

lastnode_id

These return the name and id, respectively, of the most recent node written by this user.

mission

specialties

motto

employment

These return the strings that are displayed in the user's homenode regarding his mission drive, specialties, motto, and his employer or school.

groups

returns a list of hashrefs corresponding to the groups of which this user is a member. It only lists membership in 'gods', 'Content Editors', and 'edev'. Hash keys include 'title' and 'id'.

bookmarks

returns a list of hashrefs corresponding to the nodes that this user has bookmarked. Hash keys include 'title' and 'id'.

E2::Superdoc:

clear

clears all the information currently stored in $superdoc.

text

returns the superdoc text of the currently-loaded superdoc.

E2::Room:

clear

clears all the information currently stored in $room.

description

returns the description string of the currently-loaded room. It returns undef if no room is loaded.

can_enter

returns a boolean value: whether or not the currently-logged-in user can enter this room.

E2::Usergroup:

clear

clears all the information currently stored in $group.

description

returns the description string of the currently-loaded usergroup. It returns undef if no usergroup is loaded.

list_members

returns a list of hashrefs corresponding to each member of the currently-loaded usergroup. It returns an empty list if the usergroup has no members, and undef if no usergroup is loaded.

list_weblog

returns a list of hashrefs corresponding to each item in the currently-loaded usergroup's weblog.

E2::Ticker:

new_writeups [ COUNT ]

fetches the New Writeups ticker from everything2 and returns a list of hashrefs (sorted reverse-chronologically). If COUNT is specified, it returns "COUNT" values, otherwise it returns the server's default count.

other_users [ ROOM_ID ]

fetches the Other Users ticker from everything2 and returns a list of hashrefs (sorted by descending XP). If ROOM_ID is specified, only users in the specified room are listed.

random_nodes

fetches the Random Nodes ticker from everything2 and returns a list of hashrefs.

cool_nodes [ WRITTEN_BY ] [, COOLED_BY ] [, COUNT ] [, OFFSET ]

fetches the Cool Nodes ticker from everything2 and returns a list of hashrefs (sorted reverse-chronologically). Results can be filtered by "WRITTEN_BY" and "COOLED_BY", which should be usernames. If COUNT is specified, this method returns "COUNT" values. COUNT has a server default of 50, and a max of 50 as well. OFFSET specifies how many values back to start in the list, and is used for paging through Cool Nodes.

editor_cools [ COUNT ]

fetches the Editor Cools (or "Endorsements") ticker from everything2 and returns a list of hashrefs (sorted reverse-chronologically). If COUNT is specified, it returns "COUNT" values, otherwise it returns the server's default count.

time_since [ USER_LIST ]

fetches the Time Since ticker and returns a list of values. If USER_LIST is not specified, it returns a list with one value, that corresponding to the currently-logged-in user.

available_rooms

returns a list of available rooms. The first item in this list is the "go outside" superdoc.

best_users [ NOGODS ]

returns a list of Everything2's Best Users. If NOGODS (boolean) is specified, site admins are not included in the listing.

node_heaven [ NODE_ID ]

returns a list of the currently-logged-in user's node heaven (deleted writeups). If NODE_ID is specified, it returns a list with a single element, the deleted writeup corresponding to that NODE_ID. If the specified NODE_ID is not a deleted writeup, or if the user has no deleted writeups, this method returns an empty list.

maintenance_nodes

returns a list of maintenance nodes (example: "E2 Nuke Request").

raw_vars

returns a hashref to the current user's "raw vars" hash on E2. It consists of a number of key/value pairs.

load_interfaces

loads the site-independent list of ticker nodes. E2::Ticker holds its own default list, but extremely paranoid clients can call load_interfaces to make sure they are using the up-to-date list of ticker interfaces.

interfaces

returns the list of xml interfaces used to load xml tickers. It returns a hashref with keys corresponding to the names of the interfaces and values corresponding to the node title of the corresponding ticker.

random_nodes_wit

returns the "random wit" that was fetched by the last call to random_nodes. Returns undef if none have been fetched.

time_since_now

returns the "now" value returned by the last call to time_since. Returns undef if that method has not been called.

use_string STRING

can be used to load a ticker from an XML string rather than the everything2.com server. It's used internally for debugging the tickers, and can be used to cache ticker pages (see E2::Interface::document).

E2::Message:

topic

returns the current public room's topic. This topic is updated as a side-effect of both list_public and set_room, so if neither of these methods has been called, topic will return undef.

room

returns the current room name.

room_id

returns the current room's node_id.

list_public

fetches and returns any public messages in the current room that have been posted since the last call to list_public, as well as updating the topic, room, and room_id.

list_private [ DROP_ARCHIVED ]

fetches and returns any private messages that have been posted since the last call to list_private. If DROP_ARCHIVED is true, only messages that do not have the 'archive' flag will be returned.

reset_public

resets the public message ticker, so the next call to list_public will retrieve all available public messages (they will all be considered "unfetched").

reset_private

resets the private message ticker, so that in the next call to list_private, all private messages will be considered "unfetched."

send MESSAGE_TEXT

sends MESSAGE_TEXT as if it were entered into E2's chatterbox. This message need not be escaped in any way. It returns true on success and undef on failure.

blab RECIPIENT_ID, MESSAGE_TEXT [, CC ]

sends the private "blab" message MESSAGE_TEXT to user_id RECIPIENT_ID. If CC is true, sends a copy of the message to the sender as well. Returns true on success and undef on failure.

archive MSG_ID_LIST

unarchive MSG_ID_LIST

delete MSG_ID_LIST

These methods archive, unarchive, or permanently delete the messages in MSG_ID_LIST (this is a list of message ids).

perform HASH

performs multiple archive, unarchive, and delete operations on a list of messages.

set_room ROOM_NAME

changes the current public room to ROOM_NAME. It returns true on success, 0 if ROOM_NAME is already the current room, and undef on failure.

E2::Search:

$search->search KEYWORDS [, NODETYPE ] [, MAX_RESULTS ]

performs a title search and returns a list of hashrefs to the titles found (with "title" and "node_id" as keys to each hash). NODETYPE is the type of node intended ("e2node" is default; other possibilities include "user", "group", "room", "document", "superdoc", and possibly others). MAX_RESULTS (if set) is the maximum number of results to return.

E2::UserSearch:

writeups [ USERNAME ] [, SORT_BY ] [, COUNT ] [, START_AT ]

does a "writeups by user" search on the user (USERNAME defaults to the username of the currently-logged-in user; if no user is logged in, USERNAME must be specified or a "No username specified" error is thrown) for COUNT number of writeups (default is 50), starting at START_AT (which is an offset from the highest writeup as ranked by SORT_BY--more on that later), which defaults to 0. If -1 is passed as the COUNT, this method will fetch ALL writeups by the specified user. For many users, this would be a pretty big hit on the database. The suggested method is to space calls to writeups over a period of time, perhaps only displaying a page at a time. When you receive fewer writeups than you asked for, you'll have hit the final page of the writeups search.
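
The page-at-a-time approach suggested above might look something like the following. It is a sketch, not part of e2interface: fetch_all_pages is a name invented here, and the data source is a stand-in code reference over fake data so that the example runs offline.

```perl
use strict;
use warnings;

# Fetch pages until one comes back short. $fetch_page is a code
# reference taking ( COUNT, START_AT ) and returning a list; $pause is
# the number of seconds to wait between requests.
sub fetch_all_pages {
    my ( $fetch_page, $page_size, $pause ) = @_;
    my @all;
    my $start = 0;
    while (1) {
        my @page = $fetch_page->( $page_size, $start );
        push @all, @page;
        last if @page < $page_size;    # short page: nothing left to fetch
        $start += $page_size;
        sleep $pause if $pause;        # spread the load over time
    }
    return @all;
}

# Stand-in data source so the sketch runs offline: 120 fake writeups.
my @fake  = map { "writeup $_" } 1 .. 120;
my $calls = 0;
my $fetch = sub {
    my ( $count, $start ) = @_;
    $calls++;
    return () if $start > $#fake;
    my $end = $start + $count - 1;
    $end = $#fake if $end > $#fake;
    return @fake[ $start .. $end ];
};

my @writeups = fetch_all_pages( $fetch, 50, 0 );
print scalar(@writeups), " writeups in $calls requests\n";
```

In a real client the code reference would wrap a call to writeups, passing USERNAME, SORT_BY, COUNT, and START_AT in the order given in the signature above, with a non-zero pause between requests.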

sort_results [ SORT_BY ] [, COUNT ] [, START_AT ]

sorts and returns a list of writeups (E2::Writeups) fetched from e2 by writeups. COUNT is the maximum number of writeups to fetch (-1 for ALL, which is the default), START_AT is an offset from the highest ranked writeup (ranked by SORT_BY), which defaults to 0.

compare OLD_USER_SEARCH

compares this E2::UserSearch with another, returning a list of hashrefs corresponding to each writeup that differs between the two.

stats

returns statistical information about this usersearch. This is loaded by calling compare.

E2::Session:

clear

clears all stored session values.

update

fetches the personal session from e2 and makes available all of the access methods below. If a user is not logged in, the only session information fetched will be the servertime (retrievable via time) and this user's username and user_id (retrievable via this_username and this_user_id, which are inherited from E2::Interface).

votes

cools

experience

writeups

time

These methods return the user's number of votes left today, number of cools left today, their current experience number, their current number of writeups, and the current server time. Example server time:
"Sun Mar 16 15:58:20 2003".

borged

forbidden

These methods return values corresponding, respectively, to whether the current user has been borged and whether the current user has been forbidden to post writeups. Both return boolean values, but forbidden, if true, is a text string describing the lock.

xpchange

returns the user's change in experience since the previous time he updated his user session (or loaded an epicenter nodelet). It is only defined if either the user's experience number or writeup count has changed since the previous update.

nextlevel

returns information about requirements the user must meet to reach the next level. It is only defined if either the user's experience number or writeup count has changed since the previous update (either by a call to update or by loading the epicenter nodelet).

E2::ClientVersion:

update

fetches the list of registered clients from e2.

clients

returns a hashref to the information about registered clients on e2.

E2::Scratchpad:

load [ USER_ID ]

fetches a user's scratchpad.

update [ TEXT ] [, SHARE ]

If TEXT is specified, this method updates the text of the currently-logged-in user's scratchpad. If SHARE is specified, it tells the server whether this scratchpad is to be publicly shared or not.

shared

user

text

These methods return, respectively, the boolean: "Is this scratchpad publicly shared?"; the username of the user to whom this scratchpad belongs, and the text of this scratchpad.