Drupal content migration

For more about the general approach to the migration, see the following Google Doc: Tate CMS: Content Migration Plan

Users are imported manually using the user import feature built into the site, so that process isn't covered here.

Taxonomies from Drupal have mostly been recreated using Django's enum types, with the occasional one using a model that is automatically populated via a Django data migration, so those aren't really covered here either.

Tate don't really use Drupal's 'tagging' features, so there aren't any tag-library migration steps for this project.

Each step of the migration is a management command that imports data. Run the commands in the order specified, so that references can be replaced with the correct objects during the import.

Import commands

The following commands are to be run in order:

1. import_documents

$ python manage.py import_documents

Populates the Wagtail document library with 'downloads' from the Drupal site.

This command uses the Tate API's 'nodes' endpoint, with the 'facets' filter used to identify nodes of the correct type.

The 'download' nodes in Drupal are more like pages that just display a download button for the file itself. Because we are not recreating these 'download' pages in Wagtail, a redirect is created for each 'download' that is encountered, so that any visits to the original URL are captured and redirected to the file's download URL.
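The redirect step described above can be sketched in plain Python. This is only an illustration, not the project's importer: `RedirectSpec`, `normalise_path`, `redirect_for_download`, the `url` field on the node, and the example URLs are all hypothetical names and values chosen for the sketch.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class RedirectSpec:
    """Plain stand-in for a redirect record (hypothetical, not a Wagtail model)."""

    old_path: str
    target_url: str


def normalise_path(url: str) -> str:
    """Reduce a URL to its path, with a leading slash and no trailing slash."""
    path = urlsplit(url).path.strip("/")
    return "/" + path


def redirect_for_download(node: dict, file_url: str) -> RedirectSpec:
    """Map a Drupal 'download' page URL to the imported file's download URL.

    node["url"] is an assumed field name for the node's original address.
    """
    return RedirectSpec(old_path=normalise_path(node["url"]), target_url=file_url)


# Example values are made up for illustration.
node = {"type": "download", "url": "https://www.example.org/download/file/annual-report/"}
spec = redirect_for_download(node, "/documents/12/annual-report.pdf")
print(spec.old_path, "->", spec.target_url)
```

Normalising the old path before storing it means that visits with or without a trailing slash can be matched against the same record.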

2. import_sponsors

$ python manage.py import_sponsors

Adds/updates sponsors.Sponsor snippet values to reflect the 'sponsor' nodes in Drupal.

3. import_galleries

$ python manage.py import_galleries

Adds/updates galleries.GalleryPage objects to reflect some of the 'gallery' nodes in Drupal (Tate-run galleries only).

4. import_event_venues

$ python manage.py import_event_venues

Adds/updates events.EventVenuePage objects to reflect the non-Tate 'gallery' nodes in Drupal.

5. import_venues

$ python manage.py import_venues

Adds/updates galleries.VenuePage objects to reflect the 'venue' nodes in Drupal (all of which live below one of the galleries.GalleryPages created by import_galleries).

6. import_events

$ python manage.py import_events --resilient

Adds/updates events.EventPage objects to reflect the 'event' nodes in Drupal (all of which live below one of the events.EventVenuePages created by import_event_venues).

7. import_displays

$ python manage.py import_displays --resilient

Adds/updates displays.DisplayPage objects to reflect the 'display' nodes in Drupal (all of which live below a displays.DisplayListPages page, which should exist below each relevant galleries.GalleryPage page).

8. import_display_rooms

$ python manage.py import_display_rooms --resilient

Adds/updates displays.DisplayRoom objects to reflect the 'display' nodes in Drupal that are sub-pages of another display. All of these pages should live below one of the displays.DisplayPages created by import_displays.

9. import_tatecool_pages

$ python manage.py import_tatecool_pages --resilient

Adds/updates a mix of standardpages.GeneralPage, home.HomePage, art.ArtLandingPage, press.PressLandingPage, galleries.VisitLandingPage, tate_etc.TateEtcLandingPage, tate_papers.TatePapersLandingPage pages to reflect the 'tatecool_page' nodes in Drupal.

10. import_pages

$ python manage.py import_pages --resilient

Adds/updates standardpages.GeneralPage objects to reflect the 'page' nodes in Drupal.

11. import_projects

$ python manage.py import_projects --resilient

Adds/updates projects.ProjectPage objects to reflect the 'project' nodes in Drupal.

12. import_articleseries

$ python manage.py import_articleseries --resilient

Adds/updates tate_etc.TateEtcIssuePage and tate_papers.TatePapersIssuePage objects to reflect the 'articleseries' nodes in Drupal.

13. import_subpages

$ python manage.py import_subpages --resilient

Adds/updates standardpages.GeneralPage objects to reflect the 'subpage' nodes in Drupal.

14. import_tatecool_articles

$ python manage.py import_tatecool_articles --resilient

Adds/updates a mix of articles.ArticlePage, articles.ScholarlyArticlePage, glossary.ArtTermPage, press.PressReleasePage, tate_etc.TateEtcArticlePage, and tate_papers.TatePapersArticlePage objects to reflect the 'tatecool_article' nodes in Drupal.

15. import_articles

$ python manage.py import_articles --resilient

Adds/updates a mix of articles.ArticlePage and articles.ScholarlyArticlePage objects to reflect a subset of 'article' nodes in Drupal ('in_focus' articles are handled separately).

16. import_in_focus_pages

$ python manage.py import_in_focus_pages --resilient

Adds/updates a mix of in_focus.InFocusIssuePage and in_focus.InFocusArticlePage objects to reflect a subset of 'article' nodes in Drupal.

17. import_tateetcarticles

$ python manage.py import_tateetcarticles --resilient

Adds/updates tate_etc.TateEtcArticlePage objects to reflect the 'tateetcarticle' nodes in Drupal.

18. import_tatepapers

$ python manage.py import_tatepapers --resilient

Adds/updates tate_papers.TatePapersArticlePage objects to reflect a few of the 'tatepapers' nodes in Drupal.

19. import_press_releases

$ python manage.py import_press_releases --resilient

Adds/updates press.PressReleasePage objects to reflect the 'press_release' nodes in Drupal.

20. import_kids_articles

$ python manage.py import_kids_articles --resilient

Adds/updates kids.KidsArticlePage objects to reflect the 'tatecool_kids_article' nodes in Drupal.

21. import_kids_artworks

$ python manage.py import_kids_artworks --resilient

Adds/updates kids_artwork.Artwork objects to reflect the 'kids_artwork' nodes in Drupal.
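Because the order matters, the whole sequence above could be driven from a small script. The sketch below is an assumption about how that might look, not part of the project: `build_plan` and `run_plan` are hypothetical helpers, and the script assumes it is run from the directory containing manage.py. The command names, their order, and the use of --resilient are taken from the list above.

```python
import subprocess
import sys

# Command names in the documented order.
IMPORT_COMMANDS = [
    "import_documents", "import_sponsors", "import_galleries",
    "import_event_venues", "import_venues", "import_events",
    "import_displays", "import_display_rooms", "import_tatecool_pages",
    "import_pages", "import_projects", "import_articleseries",
    "import_subpages", "import_tatecool_articles", "import_articles",
    "import_in_focus_pages", "import_tateetcarticles", "import_tatepapers",
    "import_press_releases", "import_kids_articles", "import_kids_artworks",
]

# The first five commands are documented without --resilient.
NON_RESILIENT = set(IMPORT_COMMANDS[:5])


def build_plan(extra_args=()):
    """Return one argv list per command, in the documented order."""
    plan = []
    for name in IMPORT_COMMANDS:
        argv = [sys.executable, "manage.py", name]
        if name not in NON_RESILIENT:
            argv.append("--resilient")
        argv.extend(extra_args)
        plan.append(argv)
    return plan


def run_plan(plan, runner=subprocess.run):
    """Run each command in turn, stopping at the first one that fails."""
    for argv in plan:
        # check=True raises CalledProcessError on a non-zero exit status,
        # so later steps never run against missing references.
        runner(argv, check=True)


# Print the plan rather than executing it.
for argv in build_plan():
    print(" ".join(argv[1:]))
```

Calling `run_plan(build_plan())` would execute the commands for real. Passing a different `runner` makes it possible to check the plan without touching the database.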

Useful options

--resilient: Use this when you want the import to continue processing rows, even when errors occur (invaluable for large imports).

--resume: Use this to continue an import from where it got to last time (say, if it broke half-way through due to a server timeout or other problem).

--start-row: Use this when you want to start from a specific row, rather than from the row that --resume would pick.

--stop-row: Use this when you want to halt processing after a certain row.

--verbosity: Use --verbosity=3 to get even more feedback about what the importer is doing (e.g. looking for pages to replace anchor links with, or looking for and downloading an image encountered in content), plus a summary of issues faced when parsing streamfield content (e.g. page, document and image references and links that couldn't be updated).

--force-update: Use this when you want content to be re-imported even if it looks like there have been no changes since last time.
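To make the row-related options more concrete, here is a minimal sketch of how --start-row, --stop-row and --resilient might interact inside an import loop. It is not the project's implementation: `process_rows` and `handle_row` are hypothetical, and the sketch assumes rows are numbered from 1 and that the stop row itself is still processed.

```python
def process_rows(rows, handle_row, start_row=1, stop_row=None, resilient=False):
    """Process rows numbered from 1, from start_row to stop_row inclusive.

    Returns (last_row_reached, errors). With resilient=False the first error
    is re-raised; with resilient=True it is recorded and the loop continues.
    """
    errors = []
    last_row = start_row - 1
    for number, row in enumerate(rows, start=1):
        if number < start_row:
            continue  # rows before the starting point are skipped
        if stop_row is not None and number > stop_row:
            break  # halt once the stop row has been processed
        try:
            handle_row(row)
        except Exception as exc:
            if not resilient:
                raise
            errors.append((number, str(exc)))
        last_row = number
    return last_row, errors


imported = []


def handle_row(row):
    """Pretend importer that fails on one particular row."""
    if row == "bad":
        raise ValueError("cannot import 'bad'")
    imported.append(row)


last_row, errors = process_rows(
    ["a", "bad", "c", "d"], handle_row, start_row=2, stop_row=3, resilient=True
)
print(last_row, errors, imported)
```

Under these assumptions, a resumed run would amount to calling `process_rows` again with `start_row=last_row + 1`, where `last_row` was stored at the end of the previous run.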

Fixup commands

The following commands are to be run in order:

1. fixup_information_architecture

$ python manage.py fixup_information_architecture

2. fixup_page_content

$ python manage.py fixup_page_content

3. fixup_image_captions

$ python manage.py fixup_image_captions