contentful-transform

v1.4.0

Published

4 years ago

Applies a transformation to each matching entry in a contentful space, and exports the resulting entries in a format that can be imported using `contentful-import`

gburgett

Contentful-Transform

npm install --global contentful-transform

Usage

$ contentful-transform --help
contentful-transform [options] <transform>

Commands:
  contentful-transform transform  The transformation to apply

Options:
  --help              Show help                                        [boolean]
  --version           Show version number                              [boolean]
  -s, --source        The source file, or space ID to load. "-" indicates stdin.
                                                                  [default: "-"]
  -a, --access-token  The contentful access token to use
  -o, --output        The output file to write to.  Default stdout.
  -c, --content-type  The content type to query for when loading from a space ID
  -q, --query         An entry filter query used when loading from a space ID
  -f, --filter        A filtering function to apply after loading the data.
  -r, --raw           Accept input & write output as a newline-separated stream
                      of objects rather than wrapped in the Contentful
                      export/import format                             [boolean]
  -x, --quiet         Do not output task progress                      [boolean]

Examples:
  cat contentful-export.json | contentful-transform     processes the file from stdin and
  'url=url.replace(//$/, "")'               trims trailing slashes from URLs
  contentful-transform -s contentful-export.json -f     adds a new field to every entry in
  'sys.contentType.sys.id=="foo"'           the given file matching the 'foo'
  '_entry.fields.new_field["en-US"]="somet  content type
  hing new"

A transform is either a javascript expression, or a nodejs module which exports a transformation function. When given as a string on the command line, the expression is evaluated in the context of each entry using the following template:

function (_entry, sys, field1, field2, field3...) {
  ${xform}

  _entry.fields["field1"]["en-US"] = field1;
  _entry.fields["field2"]["en-US"] = field2;
  _entry.fields["field3"]["en-US"] = field3;
  ...
  return _entry;
}

This allows you to specify simple transformations inline. For example, to strip all trailing forward slashes from a URL field, you could apply the following transform:

$ contentful-transform -s <space ID> -o transformed.json 'url=url.replace(/\/$/, "")'
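To make the template concrete, here is a minimal sketch of how an expression string could be wrapped into such a function. It is not the tool's actual implementation: `buildTransform` is a hypothetical helper, the field names are passed in by hand, and the `en-US` locale is assumed throughout.

```javascript
// Hypothetical sketch of the expression template; not contentful-transform's real code.
// Assumes every field name is a valid JavaScript identifier and uses the "en-US" locale.
function buildTransform(xform, fieldNames) {
  // After the expression runs, copy each field variable back onto the entry.
  const writeBack = fieldNames
    .map((name) => `_entry.fields[${JSON.stringify(name)}]["en-US"] = ${name};`)
    .join('\n');
  const body = `${xform};\n${writeBack}\nreturn _entry;`;
  const fn = new Function('_entry', 'sys', ...fieldNames, body);

  // Call the generated function with the entry, its sys object, and each field value.
  return (entry) =>
    fn(entry, entry.sys, ...fieldNames.map((name) => (entry.fields[name] || {})['en-US']));
}

const sampleEntry = {
  sys: { id: 'abc' },
  fields: { url: { 'en-US': 'https://example.com/blog/' } },
};
const stripSlash = buildTransform('url=url.replace(/\\/$/, "")', ['url']);
console.log(stripSlash(sampleEntry).fields.url['en-US']);
// prints "https://example.com/blog"
```

Because each field is exposed as a plain variable, the expression can simply reassign `url` and rely on the write-back step to store the new value.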

When the transform refers to a nodejs module or file, the given nodejs file should export a transformation function that accepts the entry as the first argument. The function can modify the given entry object in-place, or return a new entry.

/* ./transform.js */
module.exports = function (entry) {
  ...
}

$ contentful-transform -s <space ID> -o transformed.json ./transform.js
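As an illustration, a module equivalent of the trailing-slash expression might look like the sketch below. The `url` field and the `en-US` locale are assumptions made for the example; adjust them to your own content model.

```javascript
/* ./transform.js (hypothetical example) */

// Strips one trailing slash from the entry's "url" field, modifying the entry in place.
// Assumes a "url" field localized under "en-US"; entries without it pass through untouched.
function stripTrailingSlash(entry) {
  const url = entry.fields && entry.fields.url;
  if (url && typeof url['en-US'] === 'string') {
    url['en-US'] = url['en-US'].replace(/\/$/, '');
  }
  return entry;
}

module.exports = stripTrailingSlash;
```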

Applying the results

Only entries that have actually changed will appear in the resulting json file. This file is in a format that the contentful-import tool understands, so you can apply the transformation to your space using that tool:

$ contentful-import --space-id <space ID> --content-file transformed.json
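One way to picture the "only changed entries" behaviour is the sketch below. How contentful-transform actually detects changes is not described here, so treat `changedEntries` as a hypothetical helper that assumes a deep comparison against a snapshot taken before the transform runs.

```javascript
const { isDeepStrictEqual } = require('util');

// Hypothetical sketch: keep only the entries that a transform actually modified.
function changedEntries(entries, transform) {
  const changed = [];
  for (const entry of entries) {
    // Snapshot first, because a transform may modify the entry in place.
    const before = JSON.parse(JSON.stringify(entry));
    // A transform may return a new entry or nothing at all after mutating in place.
    const after = transform(entry) || entry;
    if (!isDeepStrictEqual(before, after)) {
      changed.push(after);
    }
  }
  return changed;
}
```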

Advanced usage

Queries

If your --source is a space ID and not a contentful-export file, you can apply a query to limit the entries that will be processed. If you do so, you must also provide a --content-type:

$ contentful-transform -s <space ID> -o transformed.json \
    -c post -q 'fields.author=jp' \
    'author="Johnny Perkins"'

The above will only download and transform entries of the post content type whose author field matches the string jp.
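For reference, the combination of `-c` and `-q` above corresponds to a Content Delivery API request roughly like the one built in this sketch. `buildEntriesUrl` is a hypothetical helper, and how the tool issues its own request is an assumption; the sketch only shows why a field query needs a content type alongside it.

```javascript
// Hypothetical sketch: build the entries URL for a content type plus a query string.
function buildEntriesUrl(spaceId, contentType, query) {
  const params = new URLSearchParams(query || '');
  // Field queries such as "fields.author=jp" only work together with a content type.
  if ([...params.keys()].length > 0 && !contentType) {
    throw new Error('A query requires a content type');
  }
  if (contentType) {
    params.set('content_type', contentType);
  }
  return `https://cdn.contentful.com/spaces/${encodeURIComponent(spaceId)}/entries?${params.toString()}`;
}

console.log(buildEntriesUrl('my-space', 'post', 'fields.author=jp'));
// prints "https://cdn.contentful.com/spaces/my-space/entries?fields.author=jp&content_type=post"
```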

Filters

If you need to filter results before the transformation is applied, but the filter cannot be expressed as a query or needs to be applied to multiple content types, you can provide the filter as a Javascript expression or module.

As an expression, the provided string is evaluated in the following function template and must return a truthy or falsy value:

function (sys, field1, field2, field3) {
  return ${filter};
}

# Add an "Author" field to the entry based on who published it
$ contentful-transform -s <space ID> -o transformed.json \
    -f 'sys.publishedBy.sys.id == "49x3q2YqBkpAasHXDVZbJY"' \
    '_entry.fields.author["en-US"] = "Johnny P"'
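The filter template can be sketched the same way as the transform template. Again, `buildFilter` is a hypothetical helper rather than the tool's own code, and it assumes field names that are valid identifiers and values stored under `en-US`.

```javascript
// Hypothetical sketch of the filter template; not contentful-transform's real code.
function buildFilter(filter, fieldNames) {
  const fn = new Function('sys', ...fieldNames, `return ${filter};`);
  // Coerce the result so callers always get a strict boolean.
  return (entry) =>
    Boolean(fn(entry.sys, ...fieldNames.map((name) => (entry.fields[name] || {})['en-US'])));
}

const isFoo = buildFilter('sys.contentType.sys.id=="foo"', []);
console.log(isFoo({ sys: { contentType: { sys: { id: 'foo' } } }, fields: {} }));
// prints "true"
```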

If provided as a module, the file must export a function which accepts the entry as an argument and returns a truthy or falsy value:

/* ./filter_published_before_2017.js */

// Assumes each entry has a "publishDate" field localized under "en-US"
var cutoffDate = Date.parse('2017-01-01')
module.exports = function (entry) {
  var publishDate = entry.fields.publishDate;
  return !!publishDate && Date.parse(publishDate['en-US']) < cutoffDate;
}

# Add a "?legacy=true" query parameter to every blog post published before 2017
$ contentful-transform -s <space ID> -o transformed.json \
    -f ./filter_published_before_2017.js \
    'url = url + "?legacy=true"'

Using stdin as a source

If you don't specify a --source or --output, contentful-transform will read from stdin and write to stdout rather than connecting to Contentful to download entries. This means you can do any of the following:

# process the results of a curl and redirect output to 'results.json'
$ curl -H "Authorization: Bearer $CONTENTFUL_ACCESS_TOKEN" https://cdn.contentful.com/spaces/<space Id>/entries | \
    contentful-transform 'name = name.toLowerCase()' > results.json

# load a contentful export file
$ contentful-export --space-id <space> --management-token $CONTENTFUL_MANAGEMENT_TOKEN
  ...
  Stored space data to json file at: ~/contentful-export-7yx6ovlj39n5-2018-04-13T09-24-82.json
  
# read the file using the --source flag
$ contentful-transform -s ~/contentful-export-7yx6ovlj39n5-2018-04-13T09-24-82.json 'name = name.toLowerCase()' > results.json

# read the file by redirecting stdin
$ contentful-transform 'name = name.toLowerCase()' < ~/contentful-export-7yx6ovlj39n5-2018-04-13T09-24-82.json > results.json

Raw mode

In Raw mode, contentful-transform accepts input as a series of JSON objects separated by newlines. The output is also produced as JSON objects separated by newlines.

{"sys":{"space":{"sys":{"type":"Link","linkType":"Space","id":"4gyidsb" ... }
{"sys":{"space":{"sys":{"type":"Link","linkType":"Space","id":"4gyidsb" ... }
{"sys":{"space":{"sys":{"type":"Link","linkType":"Space","id":"4gyidsb" ... }
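If you want to produce or consume this format from your own scripts, the sketch below shows one way to do it. `parseRaw` and `formatRaw` are hypothetical helper names, not part of contentful-transform.

```javascript
// Hypothetical helpers for the newline-separated JSON format used in raw mode.

// Parse a block of text into entries, skipping blank lines.
function parseRaw(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}

// Serialize entries as one compact JSON object per line.
function formatRaw(entries) {
  return entries.map((entry) => JSON.stringify(entry)).join('\n') + '\n';
}
```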

This is most useful on the command line, by piping it to other stream processors. For example, you could use this to power a cURL request for each entry.

#! /usr/bin/env bash
# $0: /usr/local/bin/update_events.sh

contentful-transform --raw -s <space ID> -c 'event' -f 'Date.parse(next_occurrence) < Date.now()' 'next_occurrence = new Date(Date.parse(next_occurrence) + 24 * 60 * 60 * 1000).toISOString()' | \
  jq -r '.fields.title."en-US"' | \
  xargs -I{} curl -H 'Content-type: application/json' \
    --data '{"text": "Updating next_occurrence for {}", "channel": "#general", "link_names": 1, "username": "monkey-bot", "icon_emoji": ":monkey_face:"}' \
    https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX

Wrap that up in a bash script and drop it in a cron job to get a Slack notification whenever an event's next_occurrence field gets updated.

$ chmod +x /usr/local/bin/update_events.sh
$ (crontab -l ; echo "0 0 * * * /usr/local/bin/update_events.sh > update_events.log 2>&1") | crontab -

The only downside is that it has to download every event entry from the space on each run. If you have a lot of entries, that could be a problem.

Infinite Streaming

contentful-transform is built on Node.js streams, which means that it can process an unbounded stream of input. This is useful when processing input from stdin in raw mode. For example, you could build an infinite entry generator on top of the Sync API, like the bash script in fixtures/infinite_sync.sh. Then you could connect it to contentful-transform to apply a transformation to every entry before copying it to another space with cURL.

$ infinite_sync.sh -s <space1> | \
    contentful-transform --raw '_entry.fields.host["en-US"] = "backup.myapp.com"' | \
    upload_it_somewhere.sh -s <space2>
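The streaming approach can be illustrated with a small Node.js Transform stream that handles one newline-separated entry at a time, so memory use stays flat however long the input runs. This is a sketch of the general technique only; `lineTransform` is a hypothetical name and the tool's internals may differ.

```javascript
const { Transform } = require('stream');
const { StringDecoder } = require('string_decoder');

// Hypothetical sketch: apply fn to every newline-separated JSON entry as it arrives.
function lineTransform(fn) {
  const decoder = new StringDecoder('utf8'); // handles multi-byte characters split across chunks
  let buffered = '';

  // Parse, transform, and re-serialize each complete line.
  const emitLines = (stream, lines) => {
    for (const line of lines) {
      if (line.trim() === '') continue;
      stream.push(JSON.stringify(fn(JSON.parse(line))) + '\n');
    }
  };

  return new Transform({
    transform(chunk, _encoding, callback) {
      buffered += decoder.write(chunk);
      const lines = buffered.split('\n');
      buffered = lines.pop(); // keep the trailing partial line for the next chunk
      try {
        emitLines(this, lines);
        callback();
      } catch (err) {
        callback(err);
      }
    },
    flush(callback) {
      // Process whatever is left once the input ends without a final newline.
      try {
        emitLines(this, [buffered + decoder.end()]);
        callback();
      } catch (err) {
        callback(err);
      }
    },
  });
}

// Usage: process.stdin.pipe(lineTransform((entry) => entry)).pipe(process.stdout);
```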

For stability, this could run as a Linux service installed on your web host.

Hint: Producing a stream of newline-separated entries from the output of a CDN request is easy with jq using the --compact-output flag:
<curl> | jq -c '.items[]'

TODO:

[ ] - Still gotta figure out how to re-upload entries to a space after they've been transformed
[ ] - It might be useful in raw mode to write out every entry, regardless of whether it's been transformed. Could be a flag.
[ ] - Allow users to give an empty transform to just get the infinite streaming part? But what does that do that jq does not?
