Steps for Importing Static HTML into Wordpress

As previously posted, if you have a small web site (i.e. fewer than 20 pages) then manually copying your pages and uploading your images/media might be the best and simplest way to move to Wordpress. If you have more pages and media files than you care to upload manually, then you might want to use a strategy like the following to ease the migration.

A little Wordpress background

  • Every HTML file or media file imported will become a ‘post’ in the database.
  • Each ‘post’ will be (or can be) flagged as a ‘page’, a ‘post’ or ‘media’.
  • Unique names for each media file are best. (With the default storage approach, Wordpress appends a number to the name if it encounters a duplicate during import, so it is best to start out with unique names.)
  • Unique <title> tags for each HTML page are best (again, you avoid conflicts in the import process.)
  • While some HTML markup may be removed it is more likely that it will be retained – when you view the imported page you may encounter formatting issues…
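Before importing, it can help to check for the duplicate titles and duplicate media names mentioned above. The following is a minimal sketch, not part of any Wordpress tooling: the function names, the `MEDIA_EXT` list and the case-insensitive comparison are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: these extensions count as "media"; adjust for your own site.
MEDIA_EXT = {".jpg", ".jpeg", ".png", ".gif", ".pdf"}

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or '' if there is none."""
    match = TITLE_RE.search(html)
    # Collapse runs of whitespace so line-wrapped titles compare equal.
    return " ".join(match.group(1).split()) if match else ""


def find_duplicates(pairs):
    """Group (key, path) pairs and return only the keys seen more than once."""
    groups = defaultdict(list)
    for key, path in pairs:
        groups[key.lower()].append(path)  # compare keys case-insensitively
    return {key: paths for key, paths in groups.items() if len(paths) > 1}


def audit_site(root):
    """Return (duplicate titles, duplicate media file names) found under root."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    html_files = [p for p in files if p.suffix.lower() in {".html", ".htm"}]
    media_files = [p for p in files if p.suffix.lower() in MEDIA_EXT]
    # Pages with no <title> all share the key '' and so are reported together.
    titles = [(extract_title(p.read_text(errors="replace")), str(p)) for p in html_files]
    names = [(p.name, str(p)) for p in media_files]
    return find_duplicates(titles), find_duplicates(names)
```

Calling `audit_site("path/to/your/static/site")` (a placeholder path) returns two dictionaries; anything listed in them is worth renaming or retitling before the import.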

Start with a site structure review of your current web

  • Which STATIC web pages (HTML file) will become Wordpress ‘pages’?
  • Which STATIC web pages (HTML file) will become Wordpress ‘posts’?
  • When importing pages which pages are ‘master’ pages?  (i.e. other pages link back to them OR they are your current ‘index pages’ for areas on your site.)
  • Adopt a naming convention and standardize all of your file names (page and media) – **see below for a list of special characters to avoid in file names.
  • Remove repeated structure from your pages (remove repeated items like: headers, footers & sidebars) [Wordpress will provide new headers, footers, sidebars via your 'theme'.]
  • Remove any ‘excess’ HTML markup OR ‘tweak’ the markup to ‘work’ within Wordpress.

Decide on pure text or limited/custom HTML markup

Edit your static HTML files (e.g. via a stream editor like ‘sed’) to remove or replace any troublesome markup. Look for repeated sections that can be removed. For example, it was common with older HTML pages to use tables to provide structure, where perhaps the top row is your ‘header’ and the bottom row is your ‘footer’; split the file and remove these duplicated sections. You may also want to use a tool like tidy (HTML Tidy) to ‘clean’ the HTML markup.
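As a rough illustration of splitting a file to drop a repeated header and footer, here is a minimal sketch in Python rather than sed. It assumes every page wraps its real content between two marker strings; the markers and the function name are hypothetical, so substitute whatever actually delimits the content in your own pages.

```python
def strip_repeated_sections(html, start_marker, end_marker):
    """Keep only the text between start_marker and end_marker (markers excluded).

    Pages that do not contain both markers in the expected order are returned
    unchanged, so they can be reviewed by hand instead of being mangled.
    """
    start = html.find(start_marker)
    end = html.rfind(end_marker)
    if start == -1 or end == -1 or end < start + len(start_marker):
        return html
    return html[start + len(start_marker):end].strip()


# Hypothetical page layout: header row, marked content, footer row.
page = (
    "<table><tr><td>HEADER</td></tr>"
    "<!-- content --><p>Hello</p><!-- /content -->"
    "<tr><td>FOOTER</td></tr></table>"
)
body = strip_repeated_sections(page, "<!-- content -->", "<!-- /content -->")
```

Plain string searching is used here instead of an HTML parser because older table-based pages are often not well formed; the trade-off is that it only works when the markers are consistent across pages.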

During media import Wordpress will attempt to resolve (remove/replace) the special characters listed below. These characters can cause problems for general functions that may not know how to deal with them, so you should avoid them. In general you will have fewer file-name-related issues if you limit file names to upper and lower case letters, numbers, dashes, underscores, and periods. Avoid special punctuation marks and special symbols (as a rule of thumb, avoid any symbol other than the underscore that can only be typed by using the ‘shift key’).

** Special characters to AVOID in file names (this may vary depending upon your operating system):

  • spaces:  ' '
  • apostrophes, single or double quotes:  ',  "
  • commas:  ,
  • parentheses:  ( or )
  • pipes:  |
  • exclamation mark:  !
  • other special symbols:  @, #, $, %, ^, &, *, [, ], {, }, ~, `
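A renaming pass along these lines can be sketched as follows. This is not the rule Wordpress applies during import, only an illustration of the "letters, numbers, dashes, underscores, and periods" convention described above; the function name and the choice of a dash as the replacement character are assumptions.

```python
import re

# Anything outside letters, digits, period, underscore and dash counts as unsafe.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]+")


def safe_file_name(name):
    """Return name with each run of unsafe characters replaced by a single dash."""
    cleaned = UNSAFE.sub("-", name)
    cleaned = re.sub(r"-{2,}", "-", cleaned)  # collapse repeated dashes
    cleaned = re.sub(r"-\.", ".", cleaned)    # no dash directly before a period
    return cleaned.strip("-")


print(safe_file_name("My Photo (1)!.jpg"))  # My-Photo-1.jpg
print(safe_file_name("Tom's file.JPG"))     # Tom-s-file.JPG
```

Two different originals can clean to the same name (for example "a b.jpg" and "a-b.jpg"), so check the results for collisions before renaming anything on disk.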

After renaming your files and deciding on your structure you are almost ready to import.  I have not found an import utility that will pull in both media and web page (text) content – so you have to decide:

Option A (semi-auto import - needed for large number of files/media)

  • import media files and export XML (from Wordpress) to provide a cross-reference (this works relatively well if you can import a few hundred files at a time – if you have thousands of media files then it can be a bit cumbersome but it is still do-able and still better than a manual process, single file at a time process…)
  • modify media links in your HTML files (based on the above export, ‘fix’ the links to point to the Wordpress media files) [When you export a Wordpress site you should have an XML file which contains ‘full paths’ or complete ‘URIs’ to your media resources. Using this file as a reference you can use a stream editor (like ‘sed’) to replace any references in your static HTML files with the new, Wordpress-specific URIs.]

An example – you have a static HTML file with a link to the image file: Some_IMAGE.jpg.  After import into Wordpress, the file is referenced as:

http://YOUR_NEW_DOMAIN/wp-content/uploads/2010/02/Some_IMAGE.jpg

So, prior to importing your static HTML file that uses the above image file you would change the HTML code to point to the new location of the image, i.e. if the file that you want to import contains an image link like:

<a href="/some_path/Some_IMAGE.jpg">

it needs to be changed to:

<a href="http://YOUR_NEW_DOMAIN/wp-content/uploads/2010/02/Some_IMAGE.jpg">

Of course, you can manually edit the new Wordpress page after importing but using a process similar to the above your image(s) should be present in your converted pages/posts.
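The link-fixing step of Option A can be sketched in Python instead of sed. This is a minimal sketch under stated assumptions: it assumes the export lists each media URL in an element whose tag name ends in `attachment_url` (check your own export file), that media file names are unique, and that links appear in quoted `href` or `src` attributes. The function names are hypothetical.

```python
import posixpath
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def attachment_urls(export_xml):
    """Map file name -> full URL for every attachment_url element in the export."""
    mapping = {}
    for element in ET.fromstring(export_xml).iter():
        # Compare the local tag name only, so the namespace version is irrelevant.
        if element.tag.rsplit("}", 1)[-1] == "attachment_url" and element.text:
            url = element.text.strip()
            mapping[posixpath.basename(urlparse(url).path)] = url
    return mapping


LINK_RE = re.compile(
    r'(?P<attr>\b(?:href|src)\s*=\s*)(?P<q>["\'])(?P<url>[^"\']+)(?P=q)',
    re.IGNORECASE,
)


def rewrite_media_links(html, mapping):
    """Point href/src values at the new URL when their file name is in mapping."""
    def replace(match):
        name = posixpath.basename(urlparse(match.group("url")).path)
        new_url = mapping.get(name)
        if new_url is None:
            return match.group(0)  # not an imported media file: leave as-is
        quote = match.group("q")
        return f"{match.group('attr')}{quote}{new_url}{quote}"

    return LINK_RE.sub(replace, html)
```

Because the lookup is by file name alone, two media files with the same name in different folders would be pointed at the same URL, which is one more reason to start out with unique names.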

Option B (manual import - fine for a few pages)

  • import media files
  • import HTML files
  • manually edit new ‘posts’ or ‘pages’ and ‘fix’ the missing links to your media

Need help with a similar HTML to CMS, small or large project?  (hourly or project based rates.)


Responding to a comment from Bryce:
Thanks for dropping by… When I first reviewed Wordpress (and other Open Source publishing solutions) I quickly noted the lack of a complete import solution. There is no simple approach for this type of project. My solution involves using tools that are customized for each such project and that automate most of the import. I am guessing that the 1500+ page site could probably be broken down by page types: you could categorize them and then create an import process for each type. The typical questions for these types of projects include:

  • how quickly does it need to be completed?
  • do you have adequate budget & resources to meet the timeline?
  • is out-sourcing a solution?

My approach starts with an analysis of such an HTML-based site. I categorize the pages (i.e. auto-import, manual-import) and estimate the hours needed to complete the import into Wordpress (or another DB-based solution). In some instances the auto-import pages are sub-categorized because they require special handling, so I may wind up with several similar, small data-import tools to ease the process. Good luck with your project, and yes, I would be happy to provide a review & estimate for your project.  :)

3 comments to Steps for Importing Static HTML into Wordpress

  • Hello,

    Thank you for your article, it was an interesting read if anything. However, I don’t really see how this is making migrating your static html site any easier into a WordPress website. With all the editing that would be required in the HTML pages before making the migration, it seems it would be just as easy to edit the copy and pasted text from within your post / page editor.

    Right now I’m in the middle of searching for a way to migrate an HTML blog that has over 1500+ pages to a wordpress website and having a hell of a time discovering the best method. I do appreciate the article but don’t think it really gives much ease to the process.

    Kind regards,
    Bryce

  • Interesting post. I have a large site (linked to above) currently running a hand-rolled CMS that I would like to move to WP… someday. this article gives me ideas. But about the media file problem: I think you could write a custom plugin that changes the upload directory: http://wordpress.org/extend/plugins/custom-upload-dir/ — this could let you import things in a directory structure your images are in now?

    or http://wordpress.org/extend/plugins/uploads-folder/ ?

    or http://wordpress.org/extend/plugins/relocate-upload/ ?

    Don’t know if that gets around the problem of re-sorting the media files.

  • Response to Antonio –

    Thanks for dropping by. Wordpress currently creates date-tag-folders for imported media – if you import all media on the same day then everything winds up in one folder. I suppose that you could tinker with ‘the code’ but then you have to re-tinker with each upgrade… :)

    I took a look at your site and based on a quick review, the detail pages should move fairly easily into Wordpress posts. I did not review any media files but I am guessing that they might present the area where more effort is needed. Good luck with your project!

    :)
    Dale

