Column

Split

The split task divides the content of a selected column into two separate columns at a user-specified separator. It is useful for breaking combined or delimited values into distinct columns for further analysis or processing.

Example Usage

In a dataset that stores full names in a single column, applying the split task with the space character (" ") as the separator divides the names into separate “First Name” and “Last Name” columns, making individual names easier to identify and manipulate.
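
The name example above can be sketched in plain Python. This is an illustration only, not the tool's implementation: the helper name `split_column`, the representation of rows as dictionaries, the choice to split at the first occurrence of the separator, and dropping the original column are all assumptions.

```python
def split_column(rows, column, separator, left_name, right_name):
    """Split `column` into two new columns at the first `separator`.

    Hypothetical helper: each row is a dict, and the original column
    is dropped from the result.
    """
    result = []
    for row in rows:
        # partition() splits at the first separator only; if the separator
        # is absent, the right-hand part is an empty string.
        left, _, right = row[column].partition(separator)
        new_row = {k: v for k, v in row.items() if k != column}
        new_row[left_name] = left
        new_row[right_name] = right
        result.append(new_row)
    return result


rows = [{"Name": "Ada Lovelace"}, {"Name": "Grace Brewster Hopper"}]
split_rows = split_column(rows, "Name", " ", "First Name", "Last Name")
```

Because the split produces exactly two columns, a value with more than one separator (such as a name with a middle name) keeps everything after the first separator in the second column under this sketch.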

Merge

The merge task combines the values from two columns into a single column, joined by a user-provided separator. It is helpful for consolidating related information from separate columns into a unified format.

Example Usage

In a dataset containing addresses, the merge task can be used to combine the “Street Address” and “City” columns into a single “Full Address” column, facilitating easier reference and analysis of complete addresses.
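
The address example above might look like the following in plain Python. The helper name `merge_columns`, the dictionary-per-row layout, the ", " separator, and the removal of the two source columns are assumptions made for illustration.

```python
def merge_columns(rows, first, second, separator, merged_name):
    """Join the values of two columns into one new column.

    Hypothetical helper: the two source columns are dropped and the
    merged column is appended at the end of each row.
    """
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in (first, second)}
        # str() lets non-text values (numbers, for example) be merged too.
        new_row[merged_name] = f"{row[first]}{separator}{row[second]}"
        result.append(new_row)
    return result


rows = [{"Street Address": "12 Main St", "City": "Springfield"}]
merged_rows = merge_columns(rows, "Street Address", "City", ", ", "Full Address")
```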

Reorder

The reorder task allows users to change the order of columns within a dataset. This functionality is valuable when the original column order is not conducive to analysis or presentation, allowing users to rearrange columns to better suit their needs.
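
As a rough sketch of reordering, assuming rows are dictionaries whose key order stands in for column order (the helper name `reorder_columns` is hypothetical):

```python
def reorder_columns(rows, new_order):
    """Return rows whose columns follow `new_order`.

    Hypothetical helper: `new_order` must name every existing column
    exactly once, so no column is lost or duplicated by accident.
    """
    result = []
    for row in rows:
        if sorted(new_order) != sorted(row):
            raise ValueError("new_order must list every column exactly once")
        # Python dicts preserve insertion order, so rebuilding the dict
        # in the requested order is enough to reorder the columns.
        result.append({name: row[name] for name in new_order})
    return result


rows = [{"City": "London", "Name": "Ada", "Age": 36}]
reordered = reorder_columns(rows, ["Name", "Age", "City"])
```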

Remove

The remove task deletes columns that are not needed from a dataset. It is useful for decluttering datasets and removing redundant or unnecessary information.
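
A minimal sketch of column removal, assuming dictionary rows; the helper name `remove_columns` and the choice to silently ignore names that do not exist are illustrative assumptions.

```python
def remove_columns(rows, columns_to_remove):
    """Return rows without the named columns (hypothetical helper)."""
    drop = set(columns_to_remove)
    # Names in `drop` that are not present in a row are simply ignored.
    return [{k: v for k, v in row.items() if k not in drop} for row in rows]


rows = [{"Name": "Ada", "Internal ID": 17, "Notes": ""}]
trimmed = remove_columns(rows, ["Internal ID", "Notes"])
```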

Insert

The insert task allows users to add new columns to a dataset without providing initial content. This functionality is useful for creating placeholders for additional information that may be populated at a later stage of data processing.
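
One way to picture an empty placeholder column, assuming dictionary rows: the helper name `insert_column`, the optional `position` argument, and the empty-string placeholder are assumptions, since the text above does not say how an empty column is represented.

```python
def insert_column(rows, name, position=None, placeholder=""):
    """Add an empty column called `name` to every row.

    Hypothetical helper: `position` is a zero-based column index;
    when omitted, the new column is appended at the end.
    """
    result = []
    for row in rows:
        if name in row:
            raise ValueError(f"column {name!r} already exists")
        items = list(row.items())
        index = len(items) if position is None else position
        items.insert(index, (name, placeholder))
        result.append(dict(items))
    return result


rows = [{"Name": "Ada", "City": "London"}]
with_notes = insert_column(rows, "Notes", position=1)
```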

Duplicate

The duplicate task enables users to clone selected columns within a dataset. This functionality is valuable when users need to retain the original version of a column while making modifications or transformations to a duplicate version.
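
The keep-the-original workflow described above can be sketched as follows. The helper name `duplicate_column`, the dictionary rows, and placing the copy immediately after the original are assumptions for illustration.

```python
def duplicate_column(rows, column, copy_name):
    """Clone `column` into a new column placed right after it.

    Hypothetical helper: raises if `copy_name` is already in use.
    """
    result = []
    for row in rows:
        if copy_name in row:
            raise ValueError(f"column {copy_name!r} already exists")
        new_row = {}
        for key, value in row.items():
            new_row[key] = value
            if key == column:
                new_row[copy_name] = value  # copy sits next to the original
        result.append(new_row)
    return result


rows = [{"Name": "Ada Lovelace", "City": "London"}]
duplicated = duplicate_column(rows, "Name", "Name (copy)")

# Transform the copy while the original column stays untouched.
for row in duplicated:
    row["Name (copy)"] = row["Name (copy)"].upper()
```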

Similarity

The similarity task calculates how similar the values in two columns are, using metrics such as Levenshtein distance, Jaro-Winkler distance, Jaro similarity, Jaccard similarity, and Hamming distance. It is useful for comparing textual or categorical data to identify near-matches or patterns.
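
To make three of these metrics concrete, here is a small plain-Python sketch applied row by row to two columns. It is not the tool's implementation: the function names, the dictionary rows, and computing Jaccard similarity over whitespace-separated words (rather than characters or n-grams) are assumptions. Note that Hamming distance is only defined for strings of equal length.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def jaccard(a, b):
    """Shared words divided by all distinct words (word tokens are an assumption)."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 1.0  # two empty values are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def column_similarity(rows, first, second, metric):
    """Apply `metric` to the two named columns of every row (hypothetical helper)."""
    return [metric(row[first], row[second]) for row in rows]


rows = [{"A": "kitten", "B": "sitting"}, {"A": "flaw", "B": "lawn"}]
scores = column_similarity(rows, "A", "B", levenshtein)  # [3, 2]
```

Levenshtein and Hamming return distances, where 0 means identical, while Jaccard returns a similarity between 0 and 1, where 1 means identical, so scores from different metrics are not directly comparable.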