Sitecore SPE Data Sync Extension

Back in February, I had the misfortune of having to miss Dallas' first Sitecore User Group meetup of the year, in which Michael West was going to be talking about Sitecore PowerShell Extensions. At the time I was just getting familiar with PowerShell (and SPE) in general, so this was a session I was expecting to be able to attend and learn a lot from. Knowing I wouldn't be able to make it, I reached out to Michael and asked him to record the session. The next day, I got a link to the video and was pretty blown away by what he had to show. Data Sync is an extension he had been working on to facilitate data imports from external sources. In the video, he coined it the "DEF killer"; he might just be right.

If you're looking for the tl;dr version of this post, hop on over to this section of the video or jump over to the repository to download and start playing with the tool. Otherwise, let's talk about some key features of Data Sync!!

At a high level, Data Sync is exactly what it sounds like: a tool that allows you to sync data from an external source, with the destination being Sitecore. The module comes with three supported mechanisms for importing content:

SQL Script Import
Flat File Import
Web Service Import

Each of the three options derives from a base template where the majority of your options are defined, with the process being divided into three steps:

Step 1: Define your source data

Source Path (1.1) - where your content is being hydrated from; in the example screenshot this is a URL to a web service endpoint. For file system imports, this can be a path to a file on the system. For SQL Scripts, the user is prompted to enter the SQL query they want to run.
Paging Support (1.2) - if you're calling a web service, you can tell the process how to locate the next page of data.
Pre-process Script (1.3) - the module is expecting to iterate over a list of objects ($sourceItems). If you are getting something other than an array, you can pre-process the payload to get it to an 'importable' state.
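
To make the pre-process step (1.3) a little more concrete, here is a minimal sketch of the kind of script you might write when a web service wraps its records in an envelope instead of returning a bare array. The $payload variable and the "results" property are made up for illustration; how the module actually hands you the raw response may differ, so treat this as a sketch under those assumptions rather than the module's own code.

```powershell
# Hypothetical raw response: the records live under a "results" property.
# ($payload and its shape are assumptions for this example.)
$payload = '{"total":2,"results":[{"id":1,"title":"First"},{"id":2,"title":"Second"}]}'

# Parse the JSON text into objects.
$response = $payload | ConvertFrom-Json

# Unwrap the nested records. @(...) forces an array even if only one record comes back,
# so anything iterating over $sourceItems always sees a list.
$sourceItems = @($response.results)

# Quick sanity check of what would be imported.
$sourceItems | ForEach-Object { '{0}: {1}' -f $_.id, $_.title }
```

The only real work here is getting from "whatever the endpoint returned" to a plain list of objects in $sourceItems, which is the shape the import expects to iterate over.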

Step 2: Define your mapping details

Destination Template (2.1) - Select from the content tree the template type for the content being imported
Item Name (2.2) - Define how to determine the name of the content item, based on the data you are importing
Display Name (2.3) - [optionally] specify a display name for the new item
Field-to-Data Mapping (2.4) - Specify the field(s) in the template, and how to map the data from your $sourceItem
Existing Item Detection (2.5) - Specify a field on the template to examine to determine if the imported item already exists; you can use the checkbox above it to overwrite or skip those items accordingly.
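
As a rough illustration of the mapping step, the sketch below derives an item name and a couple of field values from a single source record. Everything named here is hypothetical: the Get-SafeItemName helper, the shape of $sourceItem, and the 'Title' and 'External Id' field names are stand-ins, not part of the module.

```powershell
# Hypothetical source record, shaped like one entry of $sourceItems.
$sourceItem = [pscustomobject]@{ id = 42; title = '  Hello, World: A "Test"?  ' }

# Hypothetical helper: turn free text into something usable as an item name.
function Get-SafeItemName {
    param([string]$Text)
    # Drop everything except word characters (letters, digits, underscore), whitespace, and hyphens.
    $clean = $Text -replace '[^\w\s-]', ''
    # Collapse runs of whitespace into single spaces and trim the ends.
    ($clean -replace '\s+', ' ').Trim()
}

# Item name (2.2) computed from the source data.
$itemName = Get-SafeItemName -Text $sourceItem.title

# Field-to-data mapping (2.4), expressed as template field name -> value.
$fields = @{
    'Title'       = $sourceItem.title.Trim()
    'External Id' = [string]$sourceItem.id   # a candidate for existing item detection (2.5)
}

$itemName
```

Keeping a stable identifier from the source system in its own field is one way to give the existing item detection something reliable to match on.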

Step 3: Define your destination

Destination Path (3.1) - specify the base path for where your imported items will live; this is combined with the parent path expression
Parent Path Expression (3.2) - define an expression to determine where along the base path your items should live; this allows you to import hierarchies, instead of just a flat level!!
Post-Processor (3.3) - anything else you need to do after the import is complete, this is where you do it.
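
To show what a parent path expression could look like, here is a small sketch that files each record under a year/month folder based on a date in the source data. The "published" property, the base path, and the way the two pieces are joined are assumptions for the example; the module does its own combining of destination path and parent path.

```powershell
# Hypothetical source record with a publish date.
$sourceItem = [pscustomobject]@{ title = 'Spring launch'; published = '2019-03-14' }

# Parse the date with an explicit format so the result does not depend on the server's culture.
$date = [datetime]::ParseExact(
    $sourceItem.published,
    'yyyy-MM-dd',
    [System.Globalization.CultureInfo]::InvariantCulture)

# Parent path expression (3.2): a year/month folder pair.
$parentPath = '{0:yyyy}/{0:MM}' -f $date

# Destination path (3.1): assumed base location for the imported items.
$basePath = '/sitecore/content/Home/News'

# Joined here only to show where the item would end up.
$fullPath = "$basePath/$parentPath"
$fullPath
```

Because the expression is evaluated per record, two items with different dates land in different folders without any extra configuration.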

As mentioned, these are the key items of the Data Sync. There are a lot of fields I haven't highlighted. At this point, you might be asking "where's the magic?". The magic of this module is that IT'S ALL POWERSHELL. Each of the fields identified (and those that weren't called out) can process PowerShell scripts as their value. This means you can have complex and unique processors, including completely separate methods, for how to calculate/set field values. With a little bit of PowerShell knowledge, the user has full control over how their data gets into their system, all without the need to compile code or deploy to an environment.

Creating and executing your imports is super easy, too. Within the content tree of the module, there is an imports node housing the import sets. Each import set is a grouping of import steps you want to have executed when you kick off the import. These import steps are processed in the order they are defined in the tree, allowing you to resolve dependencies as needed. When you choose to create a new import step, you are prompted to pick the type and give it a name. Fill out the key fields defined above, and you're ready to fire it off.

Starting the import is pretty simple as well. The hooks for executing an import are located in the SPE Toolbox under the Sitecore Faux Start Button. When you choose to execute an import, you are prompted to select the import set to fire off, and away it goes.

It even provides a handy output to tell you how things went.

The last thing you might find yourself asking is "why is this guy writing about a module he didn't write?". Two reasons:

This thing is awesome. I think it has a ton of potential and practical use-cases, especially in the realm of content migration scenarios
In practical usage, I had the opportunity to work with Michael on contributing to the module. Through my own usage of the module, I was able to come up with a few quality enhancements that I am hoping will be integrated into the final package.

This module was my first big step into SPE and PowerShell, and it has taught me many things about both. Keep an eye out for my upcoming blog series where I talk about how I was able to leverage the module to do a full content migration from Drupal to Sitecore, including the import of content in 11 different languages.

Sitecore PowerShell strikes again. It truly is a great and powerful tool. If you aren't using it in your solution, you should definitely re-think some decisions you have made. Happy Sitecore-ing!!!
