Varicent ELT Assistant

Extract

Use the Extract tool when you want to extract text in specific ways using regular expressions. You can also use the Text part tool to extract text.

You can use some of the following expressions to extract text:

  • To extract digits, use (\d+).

  • To extract all of the text, use (.*).

  • To extract everything before a comma, use (.*),(.*).

  • To extract the part of an email domain between the @ sign and the first dot (for example, example from jane@example.com), use (?<=@)[^.]+(?=\.).
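To preview how these patterns behave before entering them in the tool, you can try them in any regular expression engine. The following is a minimal sketch using Python's re module. It assumes the Extract tool returns the first capture group when the pattern has one and the whole match otherwise. That behavior, and the extract helper itself, are illustrative assumptions and not the tool's documented implementation. The tool's regex engine may also differ from Python's in edge cases.

```python
import re
from typing import Optional


def extract(pattern: str, text: str) -> Optional[str]:
    """Hypothetical helper: return the first capture group if the pattern
    defines one, otherwise the whole match. Returns None when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.group(1) if match.groups() else match.group(0)


# Digits: the first run of one or more digits.
print(extract(r"(\d+)", "Order 1042 shipped"))               # 1042

# All of the text.
print(extract(r"(.*)", "Monthly Sales"))                     # Monthly Sales

# Text before a comma (first capture group).
print(extract(r"(.*),(.*)", "Smith, Jane"))                  # Smith

# Greedy matching: with several commas, the first group runs to the LAST comma.
print(extract(r"(.*),(.*)", "a,b,c"))                        # a,b

# Email domain: text between @ and the first dot that follows it.
print(extract(r"(?<=@)[^.]+(?=\.)", "jane@example.com"))     # example
```

Because (.*) is greedy, a value with several commas is split at the last comma in this sketch. If you need the text before the first comma, a pattern such as ([^,]*), is a common alternative.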

Using the Output column(s) option, you can add new columns or replace an existing column in your data set. New columns are appended to your data set and follow the naming convention (column)_extracted, for example, Monthly Sales_extracted.

Input

The Extract tool requires one data input.

Configuration

Use the following configuration options to configure the Extract tool.

Configuring the Extract tool
  1. Go to the Pipes module from the side navigation bar.

  2. From the Pipes tab, click an existing pipe to open, or create a new pipe. To create a new pipe, read the Creating a pipe documentation.

  3. In the Pipe builder, add a data source to your pipe. For more information on adding a data source, see the Data Input tool.

  4. Click + Tool.

    The Tools modal opens, where you can add tools, such as the Aggregate tool, to your pipe.

  5. In the Tools modal, search for Extract and then click + Add tool.

    Tip

    You can also find the Extract tool in the Clean section.

  6. Click the tool node and drag the line to the next tool to connect the tools. If you need to undo the action, click the line and then click Unlink.

  7. In the configuration pane, under Column, select the text column to use to extract the value.

  8. Under Regular Expression Pattern, enter the Regular Expression Pattern to use on the text column:

    • To extract digits, use (\d+).

    • To extract all of the text, use (.*).

    • To extract everything before a comma, use (.*),(.*).

    • To extract the part of an email domain between the @ sign and the first dot (for example, example from jane@example.com), use (?<=@)[^.]+(?=\.).

  9. Expand the Advanced section. Under Output column, select the type of output column from the drop-down list: either Adds a new column or Replace selected column.

  10. Under New column name, if you selected Adds a new column, enter the new column name, or leave it blank and the tool automatically populates a name, such as CompPlanID_extracted.

    Note

    Any name you create will have _extracted appended.

    If you select Replace selected column, the New column name field does not appear because you aren't creating a new column.

  11. Click on the tool name to rename your tool node to a meaningful name. Name your tools in a way that describes the function, not the object or the data action. For example, use “Look up rate” instead of “Join to rate table”.
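The output options in steps 9 and 10 can be modeled outside the tool to check what a result set should look like. The following is a rough sketch in Python, not the tool's implementation. The apply_extract function, its mode values ("add" and "replace"), and the choice to write None when a value does not match are all hypothetical. Only the _extracted suffix and the blank-name fallback come from the steps above.

```python
import re


def apply_extract(rows, column, pattern, mode="add", new_name=""):
    """Hypothetical model of the Extract tool's output options.

    mode="add"     -> write to "<name>_extracted" (name defaults to the column)
    mode="replace" -> overwrite the selected column
    """
    regex = re.compile(pattern)
    if mode == "add":
        # A blank name falls back to the source column; "_extracted" is always appended.
        target = f"{new_name or column}_extracted"
    elif mode == "replace":
        target = column
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    result = []
    for row in rows:
        match = regex.search(str(row[column]))
        value = None  # Assumption: non-matching values produce an empty result.
        if match is not None:
            value = match.group(1) if match.groups() else match.group(0)
        new_row = dict(row)  # Copy so the input rows are left unchanged.
        new_row[target] = value
        result.append(new_row)
    return result


rows = [{"CompPlanID": "Plan-2024-17"}, {"CompPlanID": "n/a"}]

added = apply_extract(rows, "CompPlanID", r"(\d+)")
print(added[0])     # {'CompPlanID': 'Plan-2024-17', 'CompPlanID_extracted': '2024'}

named = apply_extract(rows, "CompPlanID", r"(\d+)", new_name="PlanYear")
print(named[0])     # {'CompPlanID': 'Plan-2024-17', 'PlanYear_extracted': '2024'}

replaced = apply_extract(rows, "CompPlanID", r"(\d+)", mode="replace")
print(replaced[0])  # {'CompPlanID': '2024'}
```

Adding a new column keeps the original text available for later tools in the pipe, while replacing the column discards it, so the add option is the safer choice while you are still testing a pattern.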