Helpful regular expressions

ReplaceStringRegex: Using matching groups in the replacement

If your regular expression contains groups, specified by parentheses (), you can use the value matched by a group in the replacement. You reference a matching group with \n, where n is the number of the group.

For example:

Input string: foo 1234|qwerASDF 23bar

Regular expression: ([0-9]+)\|([a-zA-Z])

Will match: 1234|q (in foo 1234|qwerASDF 23bar)

Replace with: \1\2

Result: foo 1234qwerASDF 23bar

Explanation: The ReplaceStringRegex function replaces the complete match of the regex (here 1234|q) with the matched value of the first group ([0-9]+) (here 1234) followed by the matched value of the second group ([a-zA-Z]) (here q). In effect, only the pipe character is removed.
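The same replacement can be tried outside migration-center. The following is a minimal sketch in Python, whose re module uses the same \1 and \2 group references; the helper name replace_string_regex is hypothetical and only stands in for the ReplaceStringRegex function, on the assumption that every match is replaced.

```python
import re

def replace_string_regex(value, pattern, replacement):
    """Hypothetical stand-in for ReplaceStringRegex (assumes all matches are replaced)."""
    return re.sub(pattern, replacement, value)

# Group 1 captures the digits, group 2 the first letter after the pipe.
result = replace_string_regex(
    "foo 1234|qwerASDF 23bar",
    r"([0-9]+)\|([a-zA-Z])",
    r"\1\2",
)
print(result)  # foo 1234qwerASDF 23bar
```

Only 1234|q matches, so the rest of the input is left untouched.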

This is how the example would look in migration-center: [Screenshot: the example configured in the ReplaceStringRegex function]

 

Replace consecutive chars with only one char

To collapse a run of identical consecutive characters, e.g. several consecutive periods in a file name, into a single character, e.g. one period, you can use the following regular expression:

(<enter matching char here>)\1+ 

For example, if you would like to replace all consecutive periods in a file name with only one period, your transformation rule should look like this:

[Screenshot: Capture.PNG]

The example uses [\.] as the matching character, so the full regular expression is ([\.])\1+. Note that . is a special character in regular expressions and therefore needs to be escaped with \.

An input value of aaa...bbb...ccc.pdf will be converted to aaa.bbb.ccc.pdf by the above replace function.
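To check the pattern outside migration-center, here is a minimal sketch in Python. The replacement value \1 (the single captured character) is an assumption, and collapse_repeats is a hypothetical helper name rather than a migration-center function.

```python
import re

def collapse_repeats(value, char_pattern):
    """Collapse runs of the matched character into one occurrence.

    char_pattern is a regex for a single character, e.g. r"[\\.]".
    The replacement r"\\1" is an assumption.
    """
    pattern = "(" + char_pattern + r")\1+"
    return re.sub(pattern, r"\1", value)

collapsed = collapse_repeats("aaa...bbb...ccc.pdf", r"[\.]")
print(collapsed)  # aaa.bbb.ccc.pdf
```

The single period before pdf is not touched, because \1+ requires at least one repetition of the captured character.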

 

Matching control characters 

If you would like to match control characters, such as carriage return (CR) or line feed (LF), you should use the predefined character class cntrl.

For example:

The regex [[:cntrl:]] will match any control character.
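Not every regex engine knows the [[:cntrl:]] class. Python's re module, for instance, does not support POSIX bracket classes, so the sketch below spells out the ASCII control characters (0x00 to 0x1F plus 0x7F) explicitly. The helper name strip_control_chars is hypothetical, and removing the characters is just one possible use of the match.

```python
import re

# Explicit equivalent of [[:cntrl:]] for ASCII input:
# code points 0x00-0x1F and 0x7F (DEL).
CNTRL = re.compile(r"[\x00-\x1f\x7f]")

def strip_control_chars(value):
    """Remove every ASCII control character, including CR, LF and tab."""
    return CNTRL.sub("", value)

cleaned = strip_control_chars("line1\r\nline2\ttab")
print(cleaned)  # line1line2tab
```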

 

Get last value of a repeating attribute 

With two processing steps you can get the last value of a repeating attribute, no matter how many values are stored in the attribute.

  1. Use the RepeatingToSingleValue function with a separator that is not in the values list (the pipe character ("|") usually works), in order to transform all repeating attribute values into a single value.
  2. Then use the SubstringRegex function with the regex [^\|]+$ in order to fetch only the last value.
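The two steps above can be sketched in Python as follows. The function last_value is a hypothetical stand-in: joining the list mimics RepeatingToSingleValue, and re.search mimics SubstringRegex, on the assumption that SubstringRegex returns the matched text. The separator is assumed to be a single character that does not occur in any value.

```python
import re

def last_value(values, separator="|"):
    """Return the last non-empty value of a repeating attribute, or None."""
    # Step 1: merge all values into one string.
    joined = separator.join(values)
    # Step 2: match the trailing run of non-separator characters.
    match = re.search("[^" + re.escape(separator) + "]+$", joined)
    return match.group(0) if match else None

print(last_value(["first", "second", "third"]))  # third
```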

 

Remove duplicates in a repeating attribute 

There is no out-of-the-box way to remove duplicates. However, it can be achieved by applying the same regular expression repeatedly. Here are the steps:

  1. Use the RepeatingToSingleValue function to convert your repeating attribute to a single string separated by the pipe character | (a different character can be used, but then you also need to replace the pipe in the regex).
  2. Apply the regular expression ([^\|]+\|)(.*\|*)\1 to the pipe-separated values, then keep applying it to the result as many times as needed. Because of the way the regex works, a single pass cannot remove all nested duplicates.
  3. Use the SingleToRepeatingValue function to convert the resulting string back into a repeating attribute.
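The steps above can be sketched in Python to see how many passes are needed. Several details are assumptions and not taken from migration-center: the replacement value \1\2, the loop that repeats until the string stops changing, and the trailing pipe appended so that the last value can also be recognised as a duplicate (the regex expects a pipe after every value it compares). The name remove_duplicates is hypothetical.

```python
import re

# Regex from step 2: a value followed by a pipe, anything in between,
# then the same value and pipe again.
DUPLICATE = re.compile(r"([^\|]+\|)(.*\|*)\1")

def remove_duplicates(values):
    """Drop repeated values, keeping the first occurrence of each."""
    # Step 1: join the values; the trailing pipe is an assumption (see lead-in).
    joined = "|".join(values) + "|"
    # Step 2: re-apply the regex until nothing changes any more.
    while True:
        reduced = DUPLICATE.sub(r"\1\2", joined)  # replacement is an assumption
        if reduced == joined:
            break
        joined = reduced
    # Step 3: split back into separate values, dropping the empty tail.
    return [v for v in joined.split("|") if v]

print(remove_duplicates(["a", "b", "a", "c", "b", "a"]))  # ['a', 'b', 'c']
```

This input needs three passes before a fourth pass confirms that nothing changes. Note that the pattern is not anchored at a separator, so a value that is a suffix of another value (for example a and ba) can be matched incorrectly; test with representative data before relying on it.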

  

Extract the filename from a path 

Use SubstringRegex with the regular expression \\[^\\]+$. The match includes the leading backslash; to get the file name alone, use [^\\]+$ instead.
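A minimal Python sketch of this extraction, assuming SubstringRegex returns the matched text. The helper name file_name_match and the sample path are hypothetical. The matched text starts with the last backslash, so the sketch strips it in a second step.

```python
import re

def file_name_match(path):
    """Return the last backslash plus everything after it, or None."""
    match = re.search(r"\\[^\\]+$", path)
    return match.group(0) if match else None

path = r"C:\docs\reports\summary.pdf"  # hypothetical sample path
matched = file_name_match(path)
name = matched.lstrip("\\")

print(matched)  # \summary.pdf
print(name)     # summary.pdf
```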

 

Extract the file extension from a path

Use SubstringRegex with the regular expression \.[^\.]+$. The match includes the leading period.
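A minimal Python sketch of the same pattern, assuming SubstringRegex returns the matched text. The helper name file_extension and the sample paths are hypothetical.

```python
import re

def file_extension(path):
    """Return the last period and everything after it, or None."""
    match = re.search(r"\.[^\.]+$", path)
    return match.group(0) if match else None

print(file_extension(r"C:\docs\reports\summary.v2.pdf"))  # .pdf
print(file_extension("README"))                           # None
```

Only the text after the last period is returned, so a name such as summary.v2.pdf yields .pdf rather than .v2.pdf.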
