Archive for April, 2011

Today we’re announcing the launch of the CloudPreservation Public API. It’s designed to make it even easier for you to get web-accessible data into CloudPreservation. What’s particularly great about this API is how easy it makes taking fine-grained control over your web preservations, either through a handy browser-based bookmarklet tool, or by having your own development team programmatically add webpages to your feeds. Let’s take a look at two ways to use this new feature.

Keeping a website feed up to date

Say you run a website that provides lots of content and has lots of updates — naturally, you have a CloudPreservation instance pointing at it to track all the changes. To keep tabs on everything that’s happening on your site, you’ve probably got the crawl frequency of that CP instance cranked up as fast as it goes too.

Unfortunately, that’s pretty inefficient. Not all of your pages will have changed over the course of a week, so hunting through all of them for changes takes time that doesn’t need to be spent. And if you make an important change on Monday, it won’t be preserved until the next crawl, which might not be until the following Sunday.

The CloudPreservation Public API makes it really easy to get these “between crawl” changes into your feed as they happen. Simply install the bookmarklet for your website feed by dragging it to your bookmarks bar (or right clicking and choosing “Add Favorite” if you use Internet Explorer).

Then on every page you’d like to add or update, just click the bookmarklet and we’ll go fetch the newest version of it.

It really is that simple.

If you have a small development team available to you, you could even go one step further and integrate your content management system with our API. This would let you preserve copies of new or updated pages in CloudPreservation as they’re published or edited — nearly in real-time.
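A CMS integration like that can be sketched as a small publish hook. The endpoint URL, payload shape, and authentication scheme below are assumptions for illustration only; the API documentation defines the real details.

```python
import json
import urllib.request

# Hypothetical endpoint and API key -- the real URL scheme and
# authentication method are defined in the CloudPreservation API docs.
API_ENDPOINT = "https://cloudpreservation.nextpoint.com/api/v1/feeds/%s/pages"
API_KEY = "your-api-key"

def build_preserve_request(feed_id, page_url):
    """Build the HTTP request asking CloudPreservation to fetch and
    preserve a single page in the given feed."""
    return urllib.request.Request(
        API_ENDPOINT % feed_id,
        data=json.dumps({"url": page_url}).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Token " + API_KEY},
        method="POST",
    )

def on_page_published(feed_id, page_url):
    """CMS publish/update hook: push the new or edited page to the feed
    so it is preserved between scheduled crawls."""
    with urllib.request.urlopen(build_preserve_request(feed_id, page_url)) as resp:
        return resp.status
```

Wiring `on_page_published` into your CMS’s publish and edit events gets new content into your feed within seconds of it going live, rather than at the next scheduled crawl.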

Storing only the pages you want to store

This new ability to fetch and preserve single pages also lends itself to a new type of feed, which we’re calling a Public API/Bookmarklet Feed. This feed takes in only the pages you specifically tell it to, through the API or the bookmarklet.

Let’s say your company just launched an amazing new feature that’s being covered by all the major news outlets. The world is buzzing about your product and you want to preserve what they’re saying. Simply set up a Public API/Bookmarklet Feed — in this case we’ll call it “Launch Buzz” — and install its bookmarklet. Then, browse to any of the articles that you want to preserve and click the bookmarklet. CloudPreservation will see your request, and preserve a copy of that page in the “Launch Buzz” feed.

Public API/Bookmarklet feeds are fantastic for preserving this sort of research, as well as any other time you want to keep track of a collection of very specific web pages without crawling and storing their entire website. Collecting and preserving single webpages has never been easier.

More information

Bookmarklets are available today for all webpage and Public API/Bookmarklet feeds. Look for them on your feeds listing page, along with instructions to get you started.

For the more programmatically inclined, the public API — and its associated documentation — is also available for use starting today. The documentation contains examples on how to send us webpages in various programming languages as well as instructions on how to move beyond those examples to build your own custom solutions.

We think the API is going to be a great tool in your preservation arsenal. As always, we love hearing your feedback. Feel free to get in touch with us if you have any comments or questions.

As most of you know, Nextpoint is a proponent of cloud computing and a customer of Amazon Web Services (AWS). We’ve been thrilled with our relationship with Amazon in large part because of the unprecedented uptime experienced to date. So with the very public outage that AWS underwent yesterday, it’s important for us to share how this outage impacted Nextpoint products and why. Transparency builds trust, and it’s at the core of our values.

All Nextpoint products, including Cloud Preservation, Discovery Cloud, and Trial Cloud, experienced 100% uptime yesterday. There was no point in time where data was unavailable to users. That’s correct: our service remained accessible to all users for data that was already in these products.

Unfortunately, we did experience marginally delayed processing times for imports and exports. We take these delays very seriously and know the importance and potential impact to our customers. And we worked hard yesterday to communicate as quickly as we could with customers who reported delays.

How were we able to avoid downtime during the AWS outage? By preparing for it. Cloud computing is an essential tool in our architecture. It allows for increased scalability in an unprecedented fashion. But cloud computing is just one of many components of Nextpoint’s product architecture: an important component, but far from the only one. We utilize a hybrid architecture that includes cloud computing and traditional co-located servers to minimize service interruptions.

This means a bigger research and development investment but that’s our commitment to our customers. We apologize for any service disruption you experienced, and as always, if you have any questions or concerns please let us know.

In an effort to make importing into the Nextpoint Trial Cloud and Discovery Cloud as easy as possible, we’ve updated our documentation to provide detailed instruction on both the batch import process and conversion of Concordance exports.

Nextpoint Batch Import Specification details the meta-data load file format, the zip file structure, uploading, and other tips on how to successfully import batches of documents into Trial Cloud and Discovery Cloud.

CloudPreservation provides a great way to preserve not only your website, but all of your social media (Twitter, Facebook Pages & Profiles, etc). But what happens when one person needs access to the NextpointLab Twitter feed, but shouldn’t be able to comb through the Nextpoint Twitter feed?

To that end, we’ve introduced Feed Permissions.

Select the users that should/shouldn’t have access.

Selected users will be able to see the feed in question while others will not*. This lets users execute a single search across all feeds they have access to, while limiting visibility where appropriate.

* Advanced-level users have access to all feeds and cannot be limited. They are also the only users allowed to manage permission settings.
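The visibility rule above can be sketched in a few lines. The user and feed field names here are hypothetical, not CloudPreservation’s actual data model:

```python
def visible_feeds(user, feeds):
    """Return the feeds a user may see: Advanced-level users see every
    feed and cannot be limited; everyone else sees only the feeds they
    have been explicitly granted. (Field names are illustrative.)"""
    if user["role"] == "advanced":
        return list(feeds)
    return [f for f in feeds if user["id"] in f["allowed_user_ids"]]
```

A search request would then run against only the feeds this function returns for the requesting user, which is how one search can safely span everything that user is allowed to see.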

This week we released a whole bunch of great new features for our Twitter feeds on CloudPreservation.com. We’re really excited about this new functionality, as it significantly enhances CloudPreservation’s collection abilities for Twitter. Here’s a summary of a couple of these new features.

Real-Time Monitoring

This new feature allows us to collect the Twitter stream as it happens, so we won’t miss a thing. All new tweets, retweets, direct messages and deleted tweets will be collected as soon as they happen.

When creating a new Twitter feed, just check Enable real-time monitoring of this feed and you’ll be collecting the Twitter stream real-time.

Enable Real-Time Monitoring (click to enlarge)

Once you’ve set up your feed, you’ll see that it’s configured for real-time updates on the Twitter feed details:

Feed Details Showing Real-Time Monitoring (click to enlarge)

Archive Direct Messages and Protected Twitter Accounts

As part of this release, we implemented Twitter authentication into our new feed setup for Twitter. This allows us to collect much more information from Twitter accounts that provide CloudPreservation.com this access.

Twitter Authentication (click to enlarge)

With this authentication, we can now collect direct messages received and direct messages sent. Additionally, this authentication allows us to collect all of this information from protected Twitter accounts (accounts that choose to share only with friends).

These new Twitter features are a great addition to our already ample (and growing) list of features within CloudPreservation.com.

You’ve uploaded all of your docs into Discovery Cloud and, after some searching for agreed-upon keywords, have broken up the large universe of documents into several smaller sets (“subreviews”), but these “smaller” sets still contain thousands of documents each! You need to go further – you need to break these sets up again so each of your reviewers has a manageable set of documents that they can reasonably attack.

The “Split into subreviews” option on the landing page of your Review provides just that.

With a click, you’ll be on the road to breaking up that large Subreview into several smaller pieces. You can break it up into as many pieces as you’d like by simply providing their names. You may choose to name them things like “Environment-Bob” and “Environment-Sarah”, but you can get as original/specific as you like.

You control what happens to the original set of documents (“Environment” in this example). Keep it around to maintain a rolled-up view of what’s going on in the component subreviews, or remove it to reduce clutter. You also control what (if any) additional documents should be pulled into the set. For example, you could pull emails related to your documents into the overall document set, to ensure that they’re included.

Related documents will be placed together (and sequentially) into the created subreviews to provide continuity for the reader. For example, you won’t have to worry about an email landing in a different subreview than its attachments. This may lead to slightly uneven document counts across subreviews, but only in extreme circumstances will the overall counts be wildly different.
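A family-preserving split like the one described above might look something like the following sketch. It assumes documents arrive pre-sorted so related documents are adjacent; it is illustrative only, not Nextpoint’s actual algorithm.

```python
from itertools import groupby

def split_into_subreviews(documents, names):
    """Distribute documents into named subreviews, keeping each family
    (e.g. an email and its attachments) together and in order.

    `documents` is a list of (family_id, doc_id) pairs, already sorted
    so that members of the same family are adjacent.
    """
    # Group into families first, so a family never straddles two subreviews.
    families = [list(g) for _, g in groupby(documents, key=lambda d: d[0])]

    target = len(documents) / len(names)  # ideal docs per subreview
    subreviews = {name: [] for name in names}
    remaining = iter(names)
    current = next(remaining)
    for family in families:
        # Advance to the next subreview once the current one is full;
        # large families can leave the counts slightly uneven.
        if len(subreviews[current]) >= target:
            try:
                current = next(remaining)
            except StopIteration:
                pass  # last subreview absorbs any remainder
        subreviews[current].extend(family)
    return subreviews
```

Because whole families are assigned as units, an attachment can never land in a different subreview than its parent email, at the cost of counts drifting slightly from the ideal.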

After (or while) your documents are being split up, you can visit the “Settings” section to assign the subreviews to specific reviewers. If a reviewer is in the Nextpoint “Reviewer” role-type, they will only have visibility into the subreviews they are assigned. If the reviewer is of a different role-type (“Advanced” or “Standard”), assignment will provide some clarity as to who is working on what.

The ability to easily break up and assign large subreviews provides clarity and visibility into the higher-level task, helping you get the job done not only faster, but better.

Up until now, Facebook crawl results have been difficult to view as they relate to the “Wall” of a user or fan page. So, last week we rolled out some enhancements that make it easier for you to view the Facebook conversation stream in reverse chronological order.

First, we changed the title of all Facebook items in the feed to include the date, the type of post (status, link, photo, etc.), and an excerpt of the title or message of the post (if available). This should make scanning the list of Facebook items much easier.

Second, we made sure that the results are in a “newest first” order, so that you can see the conversation stream organized much as it appears on the “Wall” of the person or organization. As new comments and likes are found in subsequent crawls, the date will change and those posts will be pushed back to the top of the list.
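As a rough sketch of the two changes, with field names assumed purely for illustration (not CloudPreservation’s actual schema):

```python
from datetime import date

def facebook_item_title(item):
    """Build a scannable title: the date, the post type, and an excerpt
    of the message when one is available."""
    excerpt = (item.get("message") or "")[:60]
    title = "%s %s" % (item["date"].isoformat(), item["type"])
    return "%s: %s" % (title, excerpt) if excerpt else title

def wall_order(items):
    """Order items newest-first by last-activity date, so posts that
    pick up new comments or likes in later crawls return to the top."""
    return sorted(items, key=lambda i: i["last_activity"], reverse=True)
```

Keying the sort on last activity rather than the original post date is what lets an older post with fresh comments resurface at the top of the listing.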

Below is a screenshot of the updated results of the Facebook page:

New Facebook Listing (Click to Enlarge)

We think these changes are really going to improve the experience of navigating your comprehensive archive of Facebook profiles and fan pages.