How to Leverage Tableau Prep for Data Cleaning

Use Tableau Prep to clean and prepare your data before importing it into Tableau Desktop. This tool allows for easy data shaping, cleaning, and merging, saving time and improving efficiency.

Here is step-by-step guide on how to use Tableau for common data cleaning tasks.

Step 1: Download and Install Tableau Prep

  1. Download Tableau Prep: Go to the Tableau website and download Tableau Prep.
  2. Install the Software: Follow the installation instructions to get Tableau Prep set up on your machine.
point8 picture.png?width=936&name=point8 picture

Step 2: Connect to Data

  1. Open Tableau Prep: Launch the Tableau Prep Builder application.
  2. Connect to Data Source: Click on the + icon next to Connections to connect to your data source (Excel, CSV, databases, etc.).
  3. Add Data: Select your data file or database, and then click Add to bring your data into Tableau Prep.
Tableau Connections 2 1024x551 1

Step 3: Inspect the Data

  1. Profile Pane: The data will load into the Profile Pane where you can see a summary of your fields and sample data.
  2. Identify Issues: Look for common issues such as missing values, duplicates, and incorrect data types.

Step 4: Clean the Data

  1. Remove Unnecessary Columns:
    • Click on the columns you don’t need and select Remove.
  2. Filter Data:
    • Click on a field to create a filter. You can filter out rows that do not meet certain criteria (e.g., remove rows where a specific column is null).
  3. Rename Fields:
    • Double-click the field name to rename it for better clarity.
  4. Change Data Types:
    • Click on the data type icon next to a field name to change its data type (e.g., from string to date).

Step 5: Handle Missing Values

  1. Fill or Replace Nulls:
    • Click on a field and select More Options (three dots) > Clean > Fill Nulls or Replace Nulls. You can choose to replace null values with a specific value, forward-fill, or backward-fill.
  2. Remove Rows with Nulls:
    • Click on More Options > Clean > Remove Nulls to delete rows with null values in a specific field.

Step 6: Remove Duplicates

  1. Aggregate Step:
    • Add an Aggregate step to group by fields that should be unique.
    • Click + > Aggregate, then drag fields you want to keep into the Group By section.
    • This will remove duplicate rows based on the selected fields.

Step 7: Split Columns

  1. Split Fields:
    • Click on a field that contains multiple pieces of information you want to separate.
    • Select More Options > Transform > Custom Split. Define the delimiter and number of splits.

Step 9: Combine Data

  1. Union:
    • Click + > Union to combine rows from different tables with the same structure.
  2. Join:
    • Click + > Join to merge tables based on a common field.

Step 10: Save and Export

  1. Save the Flow:
    • Click File > Save As to save your cleaning workflow.
  2. Export the Data:
    • Click the + icon next to the Output step.
    • Define the output settings (e.g., file type, location) and click Run Flow to generate the cleaned dataset.
Tags: No tags

Add a Comment

Your email address will not be published. Required fields are marked *