Search

Recent Posts

Tags


« | Main | »

Drupal 4.x migration to Drupal 6.x

By Dale Reagan | February 4, 2010

Unfortunately there is no direct path (I did not find one) to migrate a Drupal 4.x site to Drupal 6.x (current version) – note that most likely you would face the same hurdles with any other CMS where you are making large version ‘jumps’.  Note that I previously explored Drupal and I am not an active user of the solution – I am impressed by many of it’s features, options, and many, many fine web sites.

WARNING – very long post due to examples and level of detail…

The challenges are not limited to moving/migrating database contents – you also have to review:

NOTE – by design, Drupal  connects the formatting & display of the data (what the contents looks like, where it is located on the page, and pulling data out of your site for a visitor to see);  IMO this is not a desirable feature – I would prefer that a CMS solution separate all presentation such that a site can function regardless of the theme being used.  As I understand it (after having explored both documentation and data)  Drupal content is displayed in blocks which are placed in regions of a web page; access/display of regions is controlled by themes – if your theme of choice does not work with your block type or does not ‘activate’ the block type then you won’t have access to that type of block content…  You will have to modify your theme in order to display any desired data – my preference would be to simply ‘activate’ my content and have a theme that simply manages the look & feel of the page.  [If I am mistaken or if this changes then please let me know and I will update this information…]

A) If you review the migration documentation for your CMS most likely it will include something like:

  1. make incremental moves in the ‘core’ of your CMS from your existing through all versions until you reach the desired version (i.e. version 4.1 to 5.x to 6.x)
  2. while doing the above, migrate all of your other components (perhaps incrementally, perhaps not – i.e.  version 1.0 of module X to versions 1.5, 2.9… x.x)
  3. remove/move/relink any database data ‘lost’ or ‘orphaned’ during the above process

B) Another approach might be:

  1. install latest & greatest version of desired CMS
  2. install desired modules, themes, etc. for this version of the CMS
  3. migrate your data (custom code a data transition from 4.x to 6.x)

How to avoid this type of problem?

IMO – you can’t.  Things change in dynamic environments (i.e. on the Internet.)  You either adapt, migrate, update or simply stay where you are.  What is your reason for migrating/updating?  (What’s the goal/benefit?)

C) I need this FAST!

  1. use a hybrid solution where you start with the newest/latest version of your CMS of choice (which offers that badly needed feature that prompted the move…)
  2. for as long as needed, maintain the old ‘site’ (or links, data, etc)
  3. while you carefully move/transition the old content/functionality to your new solution

My guess is that time needed for any of the approaches will be about the same.  Approach ‘C’ makes sense if you have lots of time and limited resources.   Approach ‘B’ is probably the quickest solution – provided that you have the resources (staff or budget to hire) to fund it.  Approach ‘A’ is listed because I read it (or something similar) on the Drupal site – based on my experiences with Open Source the more steps involved the more likely that secondary problems will arise and delay reaching the goal.  Of course, your mileage will vary … 🙂  [Along the way I decide to take a ‘core data’ only approach to completing a Drupal migration/upgrade as listed in option ‘A’ above – I may create a new post for that test.]

Taking a deeper look at the database changes between Drupal 4.x and 6.x

Note that the module and theme changes (or missing modules and themes) will probably require more ‘work’ to resolve than the actual migration of ‘pure data’ – which I will define as table elements that are different, added or modified between any two database versions (table & element structures and their associated attributes.) While you may be able to move your data it won’t be very useful to have if you don’t have modules that can display/use the data…  In this case I am only after the ‘core’ data which is why I started by using default table structure from a new Drupal install; if successful this approach can get a basic site updated and running with the latest version of Drupal (or at least the current latest of 6.x) – note that this approach will still require additional work for any new ‘core’ tables – you may have to connect the data.

I decided to use the mysqlshow tool (combined with a custom Bash script) to easily extract the needed information and wind up with this summary

**** Table_with_No_Changes ****
 1. authmap
 2. forum
 3. role
 4. term_data
 5. term_hierarchy
 6. users_roles
 7. variable
 8. vocabulary
 9. vocabulary_node_types

 **** Table_WITH_Changes ****
 ################################
 #  1. access                   #
 #  2. accesslog                #
 #  3. aggregator_category      #
 #  4. aggregator_category_feed #
 #  5. aggregator_category_item #
 #  6. aggregator_feed          #
 #  7. aggregator_item          #
 #  8. blocks                   #
 #  9. book                     #
 # 10. boxes                    #
 # 11. cache                    #
 # 12. comments                 #
 # 13. files                    #
 # 14. filter_formats           #
 # 15. filters                  #
 # 16. flood                    #
 # 17. history                  #
 # 18. node                     #
 # 19. node_access              #
 # 20. node_comment_statistics  #
 # 21. node_counter             #
 # 22. node_revisions           #
 # 23. node_type                #
 # 24. permission               #
 # 25. sessions                 #
 # 26. system                   #
 # 27. term_node                #
 # 28. term_relation            #
 # 29. term_synonym             #
 # 30. url_alias                #
 # 31. users                    #
 # 32. watchdog                 #
 ################################

So I wind up with ~32 tables that need a more thorough review.  Changes that probably won’t matter are those where the data length/size is increased.   Tables that will need additional review and possibly custom pre-data-massage or post-import-massage are those that have added fields or altered field properties.  Once you identify what needs tweaking you ‘massage’ the old data into the new format an simply import it (at least, that’s the idea…)

After review the table list I get things down to 20 tables that are different and relative to a data import as opposed to the operation of the CMS (i.e. the ‘cache’ for a site is specific to the current version of software – while possible to port the cache it is simpler/quicker to leave it out since it will be re-auto-built by the CMS…)  The list of Drupal 6.x tables below will require detailed review that will lead to either decisions to use ‘NULL’ data for new fields or create ‘data massage code’ to fill in any missing data.

  1.   blocks      ### Major changes/revisions/restructure ###
  2.   book                    - ints, loss of 'weight' field #
  3.   boxes       ### New field 'title', tinyints to ints ###
  4.   comments    ### several fields 'gone', many int changes ###
  5.   files       ### server new fields added          ###
  6.   filters     ### some new fields                  ###
  7.   flood       ### some new fields                  ###
  8.   node        ### many changes including keys and new fields
  9.   node_access ### key changes and tinyints
 10.   node_comment_statistics  ### int change and one decrease??
 11.   node_counter             ### int change and one decrease??
 12.   node_revisions           ### key changes, ints
 13.   node_type                ### this is a new table <<<
 14.   permission               ### New key
 15.   system                   ### keys & ints
 16.   term_node                ### keys
 17.   term_relation            ### KEYS
 18.   term_synonym             ### KEYS
 19.   url_alias                ### KEYS
 20.   users                    ### Many changes, keys, new fields

I find that many of the tables with changes to keys (or tables that have had new keys added) may not need any intervention on my part (depending upon how the data is ‘introduced’ to the database.)  An example, any key that is ‘auto-increment’ should be auto-magically managed by the database.  After a detailed review/comparison of the selected tables (and noting that I might have missed a table for two…) I find that these tables are most likely not impacted and their data is either not relevant to porting or should port via a SQL import (CSV, XML may also work.)

  1. node_access ### tinyints & int
  2. node_counter  ### int change and one decrease??
  3. node_revisions   ### key changes, ints – data may not be needed
  4. node_type  ### this is a new table <<< – Ported data is not relevant – this is a ‘structure’ table.
  5. flood – add a ‘fid’ auto-increment key
  6. filters     – add a ‘fid’ auto-increment key
  7. permission      ### New key – autoincrement – SBNImpact
  8. term_relation  ### New Key – autoincrement – SBNImpact
  9. term_synonym   ### New Key – autoincrement – SBNImpact
  10. url_alias      ### New field – language – SBNImpact
  11. users          ### New field – signature_format – SBNImpact
  12. system – description field is now called ‘owner’, new field: info, – used by installed themes, modules – probably not portable
  13. node_comment_statistics  – changes indicate a change in how this table is used – new fields – may not port (and may not be needed…)

These tables will probably work best with minor data massage (Add or remove missing fields or simply import and leave new fields with default or ‘blank’ values):

  1. node – add fields:  language, tnid and translate, – typical types:  blog, book, page story, forum, image, flexinod-?
  2. comments   – lose ‘score’ and ‘users’

Now I am down to the final tables – all of which may/will require some ‘data massage’ (either pre or post import):

  1. blocks – Major changes/revisions/restructure ### – data connected with: themes, modules, pages
  2. book – partial port? | old: bid, nid, parent, weight | new: mlin, bid, nid?
  3. boxes – loss of field ‘title’, tinyints to ints; may require data massage
  4. files – add ‘uid’, status and timestamp info – SBNImpact but may need data massage
  5. term_node – new key: Vid – may need manual data massage

Ok – we now have enough information to test some data import – and we note before starting that without a theme that supports our ‘block types’ we won’t have normal access to the data…

How to proceed?  Use SQL export and import or CSV export and import?

I decide to review the documentation for Mysql that covers export/import operations – I have found that SQL data works fine for me (but CSV or XML would also be options.)  Now it is time to become re-aquainted with mysqldump and mysqlimport as well as with the export/import options available with PhpMyadmin (IMO one of the best Open Source projects – ever!)

I did my initial import test by first exporting data using PhpMyadmin – I had no problems importing data.  Next I tried using mysqldump and mysqlimport – I ran into some problems…  Since I have a number of tables to work with I want to automate both the export and the import so I need options for controlling both actions.  If, while importing SQL data you encounter Mysql errors like:

ERROR 1136 (21S01) at line 24: Column count doesn’t match value count at row 1

chances are that you are attempting to import data without specifying/matching field-columns.  When you create your export files make sure to include table field names – when matches occur all should be well – if a table contains a new field it will be ‘missed’ (contain a default value…)

At this point I have successfully exported from Drupal 4.x and imported to Drupal 6.x 20 key tables:

  1. Import successful for Database/Table: 'drupal-test/authmap'
  2. Import successful for Database/Table: 'drupal-test/forum'
  3. Import successful for Database/Table: 'drupal-test/role'
  4. Import successful for Database/Table: 'drupal-test/term_data'
  5. Import successful for Database/Table: 'drupal-test/term_hierarchy'
  6. Import successful for Database/Table: 'drupal-test/users_roles'
  7. Import successful for Database/Table: 'drupal-test/variable'
  8. Import successful for Database/Table: 'drupal-test/vocabulary'
  9. Import successful for Database/Table: 'drupal-test/vocabulary_node_types'
 10. Import successful for Database/Table: 'drupal-test/node'
 11. Import successful for Database/Table: 'drupal-test/node_access'
 12. Import successful for Database/Table: 'drupal-test/node_counter'
 13. Import successful for Database/Table: 'drupal-test/node_revisions'
 14. Import successful for Database/Table: 'drupal-test/flood'
 15. Import successful for Database/Table: 'drupal-test/filters'
 16. Import successful for Database/Table: 'drupal-test/permission'
 17. Import successful for Database/Table: 'drupal-test/term_relation'
 18. Import successful for Database/Table: 'drupal-test/term_synonym'
 19. Import successful for Database/Table: 'drupal-test/url_alias'
 20. Import successful for Database/Table: 'drupal-test/users'

For my test it seems that some of my Drupal 4.x tables simply contained no data.  Also, I had little luck with mysqlimport but I did use mysqldump.  A final, general note – if you export/import one table at a time you might have fewer issues with memory – you will also be working with smaller files so less disk space is needed.

All Right!  Time for my last five tables…  Drum Roooooollllllll……..

  1. Import successful for Database/Table: ‘drupal-test/term_node’
  2. Import successful for Database/Table: ‘drupal-test/blocks’
  3. Import error for Database/Table: ‘drupal-test/files‘: ERROR 1054 (42S22) at line 24: Unknown column ‘nid’ in ‘field list’
  4. Import error for Database/Table: ‘drupal-test/book’: ERROR 1054 (42S22) at line 24: Unknown column ‘vid’ in ‘field list’
  5. Import error for Database/Table: ‘drupal-test/boxes’: ERROR 1054 (42S22) at line 24: Unknown column ‘title’ in ‘field list’

Ok, as expected some data challenges are present – three tables will need some ‘massage’. One table at a time:

Drupal 4.x  'files' table:
mysql -e 'use drupal-4.x; describe files;'
+----------+------------------+------+-----+---------+-------+
| Field    | Type             | Null | Key | Default | Extra |
+----------+------------------+------+-----+---------+-------+
| fid      | int(10) unsigned | NO   | PRI | 0       |       |
| nid      | int(10) unsigned | NO   |     | 0       |       |
| filename | varchar(255)     | NO   |     |         |       |
| filepath | varchar(255)     | NO   |     |         |       |
| filemime | varchar(255)     | NO   |     |         |       |
| filesize | int(10) unsigned | NO   |     | 0       |       |
+----------+------------------+------+-----+---------+-------+

Drupal 6.x 'files' table:
mysql -e 'use drupal-test; describe files;'
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| fid       | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| uid       | int(10) unsigned | NO   | MUL | 0       |                |
| filename  | varchar(255)     | NO   |     |         |                |
| filepath  | varchar(255)     | NO   |     |         |                |
| filemime  | varchar(255)     | NO   |     |         |                |
| filesize  | int(10) unsigned | NO   |     | 0       |                |
| status    | int(11)          | NO   | MUL | 0       |                |
| timestamp | int(10) unsigned | NO   | MUL | 0       |                |
+-----------+------------------+------+-----+---------+----------------+

Since the new ‘fid’ field is an auto_increment field we will ignore it.  The status and timestamp fields will also be ignored during import (but the status field will need to be updated at some point  – or all files will be flagged as permanent…)  The ‘nid’ and ‘uid’ fields require ‘data massage’; rename ‘nid’ to ‘uid’ in your import file – after import you will need to change the ‘uid’ to a valid users.id, i.e. using the SQL option in PhpMyAdmin, after selecting the ‘files’ tables, I enter the following SQL commands to ‘fix’ my ‘uid’ field (assigning all files to the ‘admin’ user for this Drupal install):

UPDATE `files`
SET `uid`=1
WHERE `uid` > 1

Note that you will also need place the actual files in your files table on the filesystem for your server, usually in the ‘/files/’ folder below your Drupal install…

Next I decide that the ‘book’ table will not ‘port’ easily since the structure has changed quite a bit from4.x – this will require manual adjustments – for now, we will leave it out

ONE FILE LEFT!

mysql -e 'use drupal-4.x; describe boxes;'
+--------+--------------+------+-----+---------+----------------+
| Field  | Type         | Null | Key | Default | Extra          |
+--------+--------------+------+-----+---------+----------------+
| bid    | tinyint(4)   | NO   | PRI | NULL    | auto_increment |
| title  | varchar(64)  | NO   |     |         |                |
| body   | longtext     | YES  |     | NULL    |                |
| info   | varchar(128) | NO   | UNI |         |                |
| format | int(4)       | NO   |     | 0       |                |
+--------+--------------+------+-----+---------+----------------+
mysql -e 'use drupal-test; describe boxes;'
+--------+------------------+------+-----+---------+----------------+
| Field  | Type             | Null | Key | Default | Extra          |
+--------+------------------+------+-----+---------+----------------+
| bid    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| body   | longtext         | YES  |     | NULL    |                |
| info   | varchar(128)     | NO   | UNI |         |                |
| format | smallint(6)      | NO   |     | 0       |                |
+--------+------------------+------+-----+---------+----------------+

Hmm, we ‘lost’ a field between these versions of Drupal.  If I use the same export/import process I get this error:  Import error for Database/Table: ‘drupal-test/boxes’: ERROR 1054 (42S22) at line 24: Unknown column…

I decide to skip this table as well – it is connected via blocks & themes so it would only be of value if I had a ported theme that used the table…

Hmm – DONE!  Well, at least with this part of the ‘Drupal port process’.  I now have ~ 4000 ‘nodes’ of useful content along with the user base from the original install. Custom options, modules and a new theme are still needed but this approach may provide a reasonable start, especially if you are using a default theme…

So, why do any of this?  In this instance a problem on a friend’s web led me to explore a bit – for anyone actually considering such a Drupal port this post may provide enough information to encourage/discourage you (or hire me! – in a few hours your ‘core’ might be ported) in your effort on a similar project. As usual, your mileage will vary.  🙂

Helpful hint – BEFORE exporting your Drupal 4.x data change your THEME to a standard theme that is present in your target version of Drupal – OR – port the theme first… and REMEMBER, this is NOT a complete solution – just a starting point, especially if you have custom themes, modules or other code to also deal with.

Need help with a similar HTML to CMS, small or large project?  (hourly or project based rates.)

Topics: Computer Technology, Problem Solving, Unix-Linux-Os, Web Problem Solving, Web Site Conversions, Web Technologies | Comments Off on Drupal 4.x migration to Drupal 6.x

Comments are closed.


________________________________________________
YOUR GeoIP Data | Ip: 73.21.121.1
Continent: NA | Country Code: US | Country Name: United States
Region: | State/Region Name: | City:
(US only) Area Code: 0 | Postal code/Zip:
Latitude: 38.000000 | Longitude: -97.000000
Note - if using a mobile device your physical location may NOT be accurate...
________________________________________________

Georgia-USA.Com - Web Hosting for Business
____________________________________