HTML Import 2

Description

Imports well-formed static HTML files into WordPress. Requires PHP 5.

This plugin will import a directory of files as either pages or posts. You may specify the HTML tag (e.g. <body>, <div id="content">, or <td width="732">) or Dreamweaver template region (e.g. ‘Main Content’) containing the content you want to import.
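To make the tag selection concrete, here is a rough Python sketch of pulling the contents out of one element identified by tag name, attribute, and value. It is only an illustration: the plugin itself is written in PHP, and the RegionExtractor class and extract_region function are hypothetical names that do not come from the plugin.

```python
from html.parser import HTMLParser


class RegionExtractor(HTMLParser):
    """Collect the inner HTML of the first element matching tag/attr/value."""

    def __init__(self, tag, attr=None, value=None):
        super().__init__(convert_charrefs=False)
        self.tag, self.attr, self.value = tag, attr, value
        self.depth = 0      # > 0 while we are inside the matched element
        self.done = False   # True once the matched element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == self.tag:
                self.depth += 1  # nested element with the same tag name
            self.parts.append(self.get_starttag_text())
        elif tag == self.tag and (
            self.attr is None or dict(attrs).get(self.attr) == self.value
        ):
            self.depth = 1       # matched; start collecting its contents

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if self.depth and not self.done:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or not self.depth:
            return
        if tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True
                return
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth and not self.done:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth and not self.done:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth and not self.done:
            self.parts.append(f"&#{name};")


def extract_region(html, tag, attr=None, value=None):
    """Return the inner HTML of the first matching element, or '' if none."""
    parser = RegionExtractor(tag, attr, value)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

For example, extract_region(page, "div", "id", "content") would return whatever sits inside <div id="content">, and extract_region(page, "td", "width", "732") would do the same for a table cell with that width.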

If importing pages, the directory hierarchy will be preserved. Directories containing the specified file types will be imported as empty parent pages (or, if an index file is present, its contents will be used for the parent page). Directories that do not contain the specified file types will be ignored.
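The hierarchy rules above can be sketched in Python as a planning step that turns a list of relative file paths into parent/child pages. This is an illustration under stated assumptions, not the plugin's code: plan_pages is a hypothetical name, and the sketch assumes a directory counts as "containing the specified file types" only when a matching file sits directly inside it.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath(".")


def plan_pages(files, extensions=("html", "htm"), index_names=("index.html",)):
    """Return (page, parent, source) rows for a list of relative file paths."""
    wanted = [PurePosixPath(f) for f in files]
    wanted = [p for p in wanted if p.suffix.lstrip(".").lower() in extensions]

    # Assumption: a directory becomes a page only if it directly holds
    # at least one file of a wanted type. Other directories are ignored.
    dirs = sorted({p.parent for p in wanted} - {ROOT})
    index_for = {p.parent: p for p in wanted if p.name in index_names}

    def nearest_page(directory):
        # Walk upward to the closest ancestor directory that became a page.
        for ancestor in directory.parents:
            if ancestor in dirs:
                return str(ancestor)
        return None

    rows = []
    for d in dirs:
        # source is None for an empty parent page, else the index file.
        source = index_for.get(d)
        rows.append((str(d), nearest_page(d), str(source) if source else None))
    for p in wanted:
        if p.parent in dirs and p.name in index_names:
            continue  # already used as the content of its parent page
        parent = str(p.parent) if p.parent in dirs else None
        rows.append((str(p), parent, str(p)))
    return rows
```

Given about/index.html and about/team.html, the sketch yields a parent page "about" whose content comes from the index file, with team.html as its child. A directory holding only images produces no page at all.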

As files are imported, the resulting IDs, permalinks, and titles will be displayed. On completion, the importer will provide a list of Apache redirects that can be used in your .htaccess file to seamlessly transfer visitors from the old file locations to the new WordPress permalinks. As of 2.0, if you change your permalink structure after you’ve imported your files, you can regenerate the redirects—the file’s old URL is stored as a custom field in the imported post.
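As an illustration of what such a redirect list might look like, the Python sketch below builds one Apache mod_alias Redirect line per imported file from a mapping of old URLs to new permalinks. The exact format the importer prints is not shown here, so the output format, the build_redirects name, and the example.com URLs are all assumptions.

```python
from urllib.parse import urlsplit


def build_redirects(mapping):
    """Build .htaccess lines from {old_url: new_permalink} pairs."""
    lines = []
    for old_url, new_url in mapping.items():
        # Apache's Redirect directive matches on the URL path only,
        # so the scheme and host of the old URL are dropped.
        old_path = urlsplit(old_url).path
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return "\n".join(lines)


print(build_redirects({
    "http://example.com/about/team.html": "http://example.com/about/team/",
}))
```

Because the old URL is kept alongside each imported post, the same function could simply be run again with fresh permalinks after a permalink structure change. Paths containing spaces would need quoting, which this sketch does not handle.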

Options:

import files into any post type (posts, pages, or custom post types set to public)

Installation

Unzip the files and upload the plugin directory to /wp-content/plugins/

Activate the plugin through the ‘Plugins’ menu in WordPress

Go to Settings → HTML Import to begin. You must save the settings before proceeding to Tools → Import → HTML.

FAQ

My title imported, but the content was empty! (Or vice versa.)

The tag you specified doesn’t surround the content you wanted to import. Open one of your old files in a browser and use its inspector (or Firebug) to select the content you want. Look for the tag that surrounds that content and find something unique about it. (An ID attribute is best, but anything unique will work. If it’s a table cell, a unique width will do just fine.) Then enter the tag name, the attribute name, and the attribute’s value into the separate boxes in the Content section of the importer’s options page.

Will the importer duplicate the layout from my old site?

No. The importer simply extracts the relevant part of each HTML file and copies it into a WordPress post. You’ll need to create a custom theme if you want to preserve the site’s appearance as well as its content.

Will this work on large numbers of HTML files?

Yes, it has been used to import over a thousand pages, and did so in a couple of minutes. However, you might need to adjust PHP’s max_execution_time setting as described below.

I import a few files and then the script times out. What can I do?

The importer will attempt to work around your server’s max_execution_time setting for PHP (usually 30 seconds), but some servers don’t allow this. You can try to increase it by adding a line to your .htaccess file:

php_value max_execution_time 160

If that gets you further but still doesn’t finish, just increase the number (it’s in seconds). However, note that your host might get irritated with you for hogging the server’s resources. If you have a lot of files to import, it’s best to install WordPress on your desktop (XAMPP for Windows and MAMP for Macs make it pretty easy) and run the importer there instead of doing it on your live server.

It’s also quite possible that the script is trying to use more memory than your server allows. You can try to change that setting, too, in .htaccess:

php_value memory_limit 1024M

Should I remove ‘images’ from the list of skipped directories if I want to import images?

The skipped directory setting just tells the importer where to look for HTML files. Linked images will be imported no matter where they’re located.

Can I import files from another server?

No. The files must be on the same server as your WordPress installation. I have no intention of ever making this plugin import files from URLs. You are welcome to fork the code if you want to add this feature.

Reviews

We have a wiki from Wikispaces consisting of 880+ pages and needed to find a new format to host it in without cutting and pasting all of those pages somewhere. A quick HTML export from that site, some mass search-and-replaces of URLs in the HTML files, and an import using this plugin, and we had a functioning site again! I hope Stephanie Leary can consider continuing work on this plugin, as it has a very specific purpose that can help some of us under certain conditions. It was a great plugin to find and immensely helpful.

This is awesome, just what I needed.
I have just one issue, surely caused by some wrong setting. Each time I import an HTML page into WP, I get all my existing pages copied too (so that I have multiple pages duplicated).
Please help with this.
Thanks in advance

After spending over a week with in-house developers, web hosting advanced customer service, added PHP memory, purchased a whole new site for a clean install, emails to the developer, posting on boards, etc., we have not been able to import even one page. Even the hosting company worked on it personally. We have attempted a direct full import, small directory files, etc. in an attempt to overcome the fatal errors, to no avail. Maybe this is a main reason there has been no update to the plugin in several months. Point of interest: we have also attempted the import conversion on two different WP releases. Poor execution.

Changelog

2.5

Fixed some incorrectly escaped options that would trigger translations on things that shouldn’t be translated

Page template selections are now pre-selected when returning to the options page (props Lee Fent)

2.4

You can now specify more than one index filename (e.g. ‘index.php, default.htm’)

New option to remove the imported title from within the content area

Fallbacks: if your chosen tag/area is empty or does not exist, the importer will select <body> for content and <title> for the title. As a last resort, if there is no title, the original file name will become the title.

You can now use a custom field named ‘post_tag’ to import tags from a portion of the file
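The title fallback order described for 2.4 can be sketched in a few lines of Python. This is a hypothetical illustration, not the plugin's PHP: pick_title is an invented name, and whether the real importer keeps the file extension in a fallback title is not stated here, so the sketch simply keeps the full file name.

```python
from pathlib import PurePosixPath


def pick_title(selected, title_tag, file_path):
    """Return the first usable title: chosen region, then <title>, then file name."""
    for candidate in (selected, title_tag):
        # Treat missing and whitespace-only values as empty.
        if candidate and candidate.strip():
            return candidate.strip()
    # Last resort: the original file name (assumption: extension kept).
    return PurePosixPath(file_path).name
```

Here selected stands for the text found in the tag or region you configured, and title_tag for the text of the file's <title> element. Either may be None or empty when nothing was found.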

2.2

Now imports media files other than images. Uses rawurldecode() to remove junk like %20 from file names, and thus should now handle situations where your link is something like my%20file.doc and your file is actually called my file.doc.

Now handles images with https srcs.

Removed a pointless security check that was preventing people from uploading valid image files.
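The file-name decoding mentioned for 2.2 can be illustrated in Python, where urllib.parse.unquote behaves like PHP's rawurldecode() for this purpose: both turn %20 into a space and both leave a literal + alone. The local_name helper is a hypothetical name used only for this sketch.

```python
from posixpath import basename
from urllib.parse import unquote, urlsplit


def local_name(href):
    """Return the on-disk file name that a percent-encoded link points to."""
    path = urlsplit(href).path  # drop any query string or fragment
    # Take the last path segment first, then decode it, so an encoded
    # slash (%2F) inside a name is not mistaken for a path separator.
    return unquote(basename(path))


print(local_name("docs/my%20file.doc"))
```

So a link written as my%20file.doc is matched against a file that is actually called my file.doc.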

2.1

New option to fix internal links. Also, the importer now bakes you cookies. (Kidding about the cookies.) (August 23, 2011)

2.0.2

Added some helpers to work around servers that do not support PHP’s multibyte string functions. (August 12, 2011)

2.0.1

2.0

New option to import images linked in the imported HTML files. It can handle most relative paths as well as absolute URLs. The report includes a list of the image paths that couldn’t be found.

Now supports all public custom post types and taxonomies (including hierarchical ones).

Completely different, much better handling of special characters.

The import screen now lets you upload a single file.

New user interface. The options form is now broken up into several tabbed sections. Categories and other hierarchical taxonomies are selected with checkboxes.

The options form is now separate from the importer. It will now check your settings before the importer runs — for example, you’ll get a warning if your beginning directory isn’t readable.

The importer itself is now based on the WordPress import class, which means it looks and works more like other importers. It is located under Tools → Import (but you should visit the settings screen first).

Files’ old URLs are now stored as custom fields in the imported posts. There’s now an option to regenerate the redirects for your imported files, which is handy if you changed your permalink structure after you finished importing.