Varicent ELT Help Center

Extract

Abstract

Extract text from the value using a regular expression.

Use the Extract tool when you want to extract text in specific ways using regular expressions. You can also use the Text part tool to extract text.

You can use expressions such as the following to extract text:

  • To extract digits, use (\d+).

  • To extract all of the text, use (.*).

  • To extract everything before a comma, use (.*),(.*).

  • To extract the email domain name (the text between @ and the first period that follows), use (?<=@)[^.]+(?=\.).
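As a rough illustration of what the patterns above match, here is a minimal sketch using Python's re module. This page does not state which regular expression engine the Extract tool uses, so Python is only a stand-in. The sample values and the first_match helper are invented for this example, and returning the first capture group when one exists is an assumption about how the result is chosen.

```python
import re

# Hypothetical sample values, one per pattern from the list above.
samples = {
    r"(\d+)": "Order 1042 shipped",
    r"(.*)": "Monthly Sales",
    r"(.*),(.*)": "Smith, Jane",
    r"(?<=@)[^.]+(?=\.)": "jane.smith@example.com",
}

def first_match(pattern, text):
    """Return the first capture group if the pattern has one, else the whole match."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.group(1) if m.groups() else m.group(0)

results = {p: first_match(p, t) for p, t in samples.items()}
for pattern, value in results.items():
    print(pattern, "->", repr(value))
```

Because .* is greedy, (.*),(.*) splits at the last comma in a value: in Python, first_match(r"(.*),(.*)", "a, b, c") returns "a, b". If a value can contain several commas and you want only the text before the first one, a pattern such as ([^,]*) may be a better fit.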

Using the Output column(s) option, you can add new columns or replace an existing column in your data set. New columns are appended to your data set and follow the naming convention (column)_extracted, for example, Monthly Sales_extracted.

Input

The Extract tool requires one data input.

Configuration

Use the following configuration options to configure the Extract tool.

Configuring the Extract tool
  1. Go to the Pipes module from the side navigation bar.

  2. From the Pipes tab, click an existing pipe to open it, or create a new pipe. To create a new pipe, read the Creating a pipe documentation.

  3. In the Pipe builder, add at least one data source to your pipe. For more information on adding a data source, see the Data tool.

  4. Click + Tool.

    The Tools modal opens, where you can add tools, such as the Aggregate tool, to your pipe.

  5. In the Tools modal, search for Extract and then click + Add tool.

    Tip

    You can also find the Extract tool in the Clean section.

  6. Click the tool node and drag the line to the next tool to connect the tools. If you need to undo the action, click the line and then click Unlink.

  7. In the configuration pane, enter the following information:

    Table 52. Extract tool configuration

    Field: Column
    Description: Select the text column to extract the value from.

    Field: Regular Expression Pattern
    Description: Enter the regular expression pattern to apply to the text column:

    • To extract digits, use (\d+).

    • To extract all of the text, use (.*).

    • To extract everything before a comma, use (.*),(.*).

    • To extract the email domain name (the text between @ and the first period that follows), use (?<=@)[^.]+(?=\.).

    Advanced section

    Field: Output column
    Description: Select the type of output column from the drop-down list, either Adds a new column or Replace selected column.

    Field: New column name
    Description: If you select Adds a new column, enter the new column name, or leave the field blank to have the tool populate a name automatically, such as CompPlanID_extracted.

    Note

    The tool appends _extracted to any name you enter.

    If you select Replace selected column, the New column name field does not appear because you aren't creating a new column.
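
The two output options can be pictured with a small Python sketch. This is not the tool's implementation: extract_column, its parameters, and the sample rows are hypothetical, and leaving unmatched rows empty (None) is an assumption, since this page does not describe that case.

```python
import re

def extract_column(rows, column, pattern, output="add", new_name=""):
    """Apply `pattern` to `column` in each row (a dict) and return new rows.

    output="add"     keeps the source column and adds "<name>_extracted"
    output="replace" overwrites the source column with the extracted text
    """
    regex = re.compile(pattern)
    # Assumption: a blank name falls back to the source column name, and
    # "_extracted" is appended either way, as the Note describes.
    if output == "replace":
        target = column
    else:
        target = f"{new_name or column}_extracted"
    result = []
    for row in rows:
        m = regex.search(str(row[column]))
        if m is None:
            value = None  # Assumption: no match leaves the value empty.
        else:
            value = m.group(1) if m.groups() else m.group(0)
        result.append({**row, target: value})
    return result

# Hypothetical sample data.
rows = [{"CompPlanID": "PLAN-2041"}, {"CompPlanID": "PLAN-77"}]
added = extract_column(rows, "CompPlanID", r"(\d+)")
named = extract_column(rows, "CompPlanID", r"(\d+)", new_name="PlanNumber")
replaced = extract_column(rows, "CompPlanID", r"(\d+)", output="replace")
print(added[0])
print(named[0])
print(replaced[0])
```

In this sketch, leaving the name blank produces a CompPlanID_extracted column, entering PlanNumber produces PlanNumber_extracted, and the replace option leaves a single CompPlanID column holding the extracted digits.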