Last Updated on January 10, 2024 by Jake Sheridan
How to Clean Data in Google Sheets
Struggling to make sense of messy data in Google Sheets?
Enter: data cleaning.
Data cleaning is the process of identifying and fixing errors and inconsistencies in a dataset to improve its quality and reliability.
Whether you’re a beginner or a seasoned Sheets user, these techniques will help you transform chaotic spreadsheets into clear, actionable sheets.
That’s the plan at least.
1. How to Use the Cleanup Suggestions Tool in Google Sheets
Cleaning your dataset is a crucial step in proper data analytics and reporting.
Unclean data can lead to inaccurate conclusions and, in turn, to incorrect decisions.
Luckily, Google Sheets comes with a Cleanup suggestions assistant that scans your dataset for issues and proposes fixes.
These can include inconsistent formatting, unnecessary whitespace characters, and duplicate rows.
To use the Cleanup suggestions tool in Google Sheets, just follow these steps:
Open your Google Sheets document.
Ensure you are viewing the sheet where you want to apply cleanup suggestions.
This sheet should already have some data for the suggestions to work on.
In this example, we have a raw dataset containing order information for an online T-shirt store.
After a quick glance, you may notice that the dataset contains duplicate transactions (based on the unique ID in column A), unnecessary whitespace characters (in column C), and inconsistent formatting (in column D).
Click on the Data menu in the top menu bar.
From the dropdown list, look for the Data cleanup option.
Under this option, click on the Cleanup suggestions tool.
Google Sheets will analyze your data and provide suggestions such as removing duplicates, fixing text formatting, or identifying inconsistent data.
Google Sheets should now display a Cleanup suggestions panel on the right side of the page.
This panel will show all the cleanup suggestions for your data.
At the top of the panel, you can use a dropdown menu to sort through the suggestions by the column they’re trying to clean.
By default, this is set to All columns.
In our example, the Cleanup suggestions tool spotted several duplicate rows as well as columns with multiple formats.
You can choose to ignore or accept each of these changes.
The Cleanup suggestions tool can also spot cells with leading and trailing whitespace characters.
Click the “Trim all” button to remove these whitespace characters from the dataset.
After accepting all the cleanup suggestions you find necessary, you should now have a cleaner, more uniform dataset to work with.
Do note that while the Cleanup suggestions tool is helpful, it won’t be able to catch all possible issues with your dataset.
We recommend analyzing your dataset further for more opportunities to clean your data.
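For readers who like to see the logic spelled out, here is a minimal Python sketch of the kind of checks described above: counting duplicate rows and flagging cells with leading or trailing spaces. It is not how Google Sheets implements Cleanup suggestions; the function name scan_for_issues and the sample orders data are hypothetical and only serve as an illustration.

```python
from collections import Counter


def scan_for_issues(rows):
    """Report duplicate rows and cells with stray edge whitespace.

    `rows` is a list of lists of strings (header row excluded).
    This is a hypothetical helper, not a Google Sheets API.
    """
    # Count identical rows; every repeat beyond the first is a duplicate.
    counts = Counter(tuple(row) for row in rows)
    duplicate_rows = sum(n - 1 for n in counts.values() if n > 1)

    # A cell needs trimming if stripping whitespace changes it.
    untrimmed_cells = [
        (r, c)
        for r, row in enumerate(rows)
        for c, cell in enumerate(row)
        if cell != cell.strip()
    ]
    return {"duplicate_rows": duplicate_rows, "untrimmed_cells": untrimmed_cells}


# Made-up sample data: order ID, color, size.
orders = [
    ["1001", "Blue", " Large"],
    ["1002", "Red", "Small "],
    ["1001", "Blue", " Large"],
]
report = scan_for_issues(orders)
print(report)
```

A scan like this only reports what it is told to look for, which is why a manual review of the data is still worthwhile.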
2. How to Remove Duplicates
When performing data analysis, duplicate entries will likely skew your data, which can lead to inaccurate insights and models.
By removing duplicates, you ensure that each data point is unique, providing a more accurate and reliable basis for analysis, reporting, or decision-making.
We can remove duplicates in Google Sheets using the built-in Remove duplicates feature that is part of the program’s group of data cleanup tools.
To remove duplicates in your data in Google Sheets, just follow these steps:
In the table above, we have a dataset of item sales with multiple duplicate entries we want to find and remove.
Select the range of cells where you want to find and remove duplicates.
To select an entire table, you can click on any cell in the table and use the keyboard shortcut Ctrl+A (Windows) or Cmd+A (Mac).
In our sample spreadsheet, we’ll select the range A1:C20.
Click on the Data menu in the top menu bar, then look for the Data cleanup option.
From the Data cleanup options, select Remove duplicates.
This option specifically targets duplicate entries in your selected range.
In the Remove duplicates dialog box, you can choose which columns to check for duplicates.
If your selection includes a header row, make sure to check the option “Data has header row” to exclude the header from the duplicate search.
In this example, we’ll analyze all three columns in our selection.
This will ensure that the entire row (Product ID, category, number of sales) must match for it to be considered a duplicate entry.
Click on Remove duplicates to proceed.
Google Sheets will indicate how many duplicate rows were found and removed as well as how many unique rows in the selection remain.
Click on OK to proceed.
Do note that it is wise to keep a copy of your original data.
Removing duplicates using this method overwrites the original selection.
You should also remember that removing duplicates may affect formulas that reference the data range in question.
After removing duplicates, consider looking into your sheet’s formulas to see if their references need adjustments.
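The matching rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Google Sheets implementation: it assumes the first occurrence of each row is the one kept, and the function name remove_duplicates, the key_columns parameter, and the sample sales data are all hypothetical.

```python
def remove_duplicates(rows, key_columns=None):
    """Return rows with later duplicates dropped, keeping first occurrences.

    key_columns: indexes of the columns to compare; None compares whole rows,
    which mirrors checking every column in the dialog box.
    """
    seen = set()
    unique = []
    for row in rows:
        if key_columns is None:
            key = tuple(row)
        else:
            key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample data: product ID, category, number of sales (no header row).
sales = [
    ["P-01", "Shirts", 12],
    ["P-02", "Hats", 5],
    ["P-01", "Shirts", 12],
    ["P-02", "Hats", 7],
]
deduped = remove_duplicates(sales)
removed = len(sales) - len(deduped)
print(f"{removed} duplicate rows removed; {len(deduped)} unique rows remain")
```

Note that the two P-02 rows both survive when every column is compared, because their sales figures differ. Passing key_columns=[0] would instead treat any repeated product ID as a duplicate, which is the effect of checking only one column.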
3. How to Trim Whitespace Characters
Whitespace characters, such as spaces, tabs, or new line characters, can lead to issues in data processing, sorting, and analysis.
For instance, extra spaces in text entries might cause sorting errors or mismatches in searches and lookups, leading to inaccurate results.
When cleaning your data, we recommend trimming these characters to ensure your data is uniform and behaves as expected.
To trim whitespace in Google Sheets using the data cleanup tool, you can follow these steps:
Open your Google Sheets document that contains the data you want to clean up.
The table above contains a list of orders for a particular item.
Each order contains the color and model ordered.
We want to remove unnecessary whitespace from this data.
Select the range of cells where you want to trim the whitespace.
You can select a specific range, an entire column, or the entire sheet by clicking the gray corner square to the left of column A and above row 1.
In our current example, we’ll select the entire table (excluding the headers in row 1).
Click on the Data option in the menu bar. From the dropdown options, select Data cleanup which groups together various data cleaning tools.
From the dropdown options, click the Trim whitespace feature.
After clicking on the “Trim whitespace” option, Google Sheets will process your selected range and automatically remove all extra spaces.
This includes spaces at the beginning and end of the text in a cell, as well as any extra spaces between words.
Google Sheets will also notify you how many cells were affected by the Trim whitespace feature.
Click on OK to proceed.
In the table above, Google Sheets was able to remove all leading and trailing whitespace from our table.
Using the TRIM Function
If you prefer to use a Google Sheets formula instead, you can combine the ARRAYFORMULA and TRIM functions to trim whitespace from a range of values.
To trim whitespace in Google Sheets using functions, you can do the following:
First, select the cell where you want to output the cleaned data.
For this example, we’ll output our cleaned data starting at cell D14.
Next, we’ll use the TRIM function with an ARRAYFORMULA function wrapped around it.
We’ll use the data we want to clean as the sole argument.
In the example above, we used the formula =ARRAYFORMULA(TRIM(A14:B24)).
Hit the Enter key to evaluate the formula.
The formula should return an array containing a modified copy of the target data without leading, trailing, or repeated spaces. The original range is left unchanged.
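To make the trimming behavior concrete, here is a rough Python sketch of the same idea: strip spaces from both ends of each value, collapse runs of spaces between words into one, and return a new range rather than editing the original. It only handles the plain space character and is an approximation for illustration; the function names trim and trim_range and the sample data are hypothetical.

```python
import re


def trim(text):
    """Strip leading/trailing spaces and collapse repeated spaces to one."""
    return re.sub(r" {2,}", " ", text.strip(" "))


def trim_range(cells):
    """Return a trimmed copy of a 2D range; the input is left unchanged."""
    return [[trim(cell) for cell in row] for row in cells]


# Made-up sample data: color and model columns with stray spaces.
orders = [
    ["  Red ", "Model  A"],
    ["Blue  ", " Model B"],
]
cleaned = trim_range(orders)
print(cleaned)
```

Because trim_range builds a new list, the messy values stay available for comparison, much like the formula writes its output to a separate range.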
Hopefully this guide has given you an overview of how to clean data in Google Sheets.
What’s Next?
Explore some of the other useful resources on Sheets for Marketers:
- Google Sheets Templates – A collection of over 200 Google Sheets templates for SEO, marketing, PPC and more.
- Google Sheets Tutorials – A growing collection of spreadsheet tutorials for doing (almost) everything in Google Sheets.
- Data Studio Templates – A collection of over 50 Google Data Studio templates for SEO reporting.