Last Updated on January 9, 2024 by Jake Sheridan
What is the Google Sheets QUERY Function?
The QUERY function is a dynamic and versatile function that allows you to execute queries on your spreadsheet data as if it were a database.
This function uses a query language similar to SQL (Structured Query Language) that lets users sort, filter, and analyze datasets in more sophisticated ways than standard spreadsheet functions allow.
As your data changes or updates, the results of the QUERY function automatically update to reflect these changes, making it ideal for dashboards and real-time data analysis.
The function simplifies complex data queries that would otherwise require multiple functions or extensive manual work. It can replace a combination of functions like VLOOKUP, HLOOKUP, FILTER, and SORT.
QUERY Function Syntax
The basic syntax of the QUERY function in Google Sheets is as follows:
=QUERY(data_range, query_string, [headers])
Here’s what each argument in this function is for:
- data_range refers to the range of cells that you want to query.
- query_string is a text string that follows the Google Visualization API Query Language. This string is where you specify the query to perform on the data. The query value must be enclosed in quotation marks or contain a cell reference to a cell with the query string.
- [headers] is an optional parameter that indicates the number of header rows at the top of the data range. If omitted, Google Sheets guesses the number of header rows.
To start using the QUERY function, you’ll first have to familiarize yourself with how the Google Sheets query language works.
This includes knowing the different clauses and keywords and how they’re used to determine the output of a query.
Each query is composed of one or more clauses yhat specify how to process and display your data. Each of these clauses should start with one or two keywords.
In order of, these are the available clauses for a query in Google Sheets:
- GROUP BY
- ORDER BY
Each type of clause has its own specific role and syntax, allowing for complex data manipulation directly within Google Sheets.
Note that not all clauses need to be specified in a query but all clauses used must be mentioned in the same order as listed above.
The QUERY function will not be able to parse the query string correctly if they are ordered incorrectly.
Aggregate functions allow you to perform calculations on a group of values, providing a way to summarize or aggregate your data.
Aggregate functions often go hand-in-hand with the GROUP BY clause in QUERY.
This clause groups your data based on the values in one or more columns, allowing aggregate functions to perform calculations on each group.
Here are the available aggregate functions you can use in a query:
Arithmetic operations in the Google Sheets QUERY function allow you to perform basic mathematical calculations on your data directly within a QUERY statement.
They are particularly useful when you need to transform, analyze, or calculate new values from your existing dataset.
The QUERY function supports:
- addition (+)
- subtraction (–)
- multiplication (*)
- division (/)
For example, let’s use the query string SELECT A+B:
The output will be a single column where each row value is the sum of the row’s corresponding value of A and B.
Similarly, the QUERY function will be able to subtract (SELECT A – B), multiply (SELECT A * B), and divide (SELECT A / B) columns together.
What to Consider Before Running a QUERY Function?
The QUERY function is a powerful tool that you can use to interact with and explore your data.
However, you may not be aware of some details about how the QUERY function interacts with your data.
Here are some factors to consider before running a QUERY function.
- Before creating your query, ensure you understand the structure of your dataset. Understand which columns are numerical or include dates and which columns hold text.
- Create a clear objective for your query. If your dataset is large, you may consider limiting the results to a particular time range or category.
- Always refer to columns by the column letter or by using ‘Col1’, ‘Col2’, and so on. When creating a query in Google Sheets, it is a common mistake to use the header names. For example, if your table contains headers like “name” or “price”, using “SELECT name, price” will result in an error.
- When adding a new column or rearranging your data, the query_string argument should always be updated since it is not automatically updated by Google Sheets.
- Queries in Google Sheets expect all data in a single column to have a single type. If your dataset contains columns with mixed data, consider cleaning your data first. If there are mixed data types in a single column, the majority data type will determine the data type of the column for query purposes.
- When using GROUP BY, remember that any column that is not part of the GROUP BY clause needs to be wrapped in an aggregator function.
SELECT Specific Columns
The SELECT clause allows users to select which columns to return and in what order.
If this clause is omitted from the query, the entire column will be returned in its original order.
Let’s take a look at a simple example of the SELECT clause in action.
Suppose you want to query the table above but you only want to output the id, last_name, and zip code fields.
We can use the query string SELECT A,B,G to return our dataset with just our desired columns.
To output all columns from our source dataset we can use the SELECT * query string.
The WHERE clause allows us to filter our dataset based on specified conditions.
Here are the basic logical operators you can use in the WHERE clause:
- Less than or equal (<=)
- Less than (<)
- Greater than (>)
- Greater than or equal (>=)
- Equal (=)
- Not equal (<>)
Let’s look at a sample QUERY function that uses the WHERE clause.
In the query above, we used the WHERE clause to only display data where the state (column F) is equal to “CA”
We can combine several conditions together using the and, or and not keywords.
In the query above, we used the AND keyword to ensure that only rows where the state (column F) is equal to “CA” and the id (column A) is lower than 1050.
The WHERE clause also supports more advanced operators such as starts with, ends with, contains, matches, and like.
These advanced operators allow you to compare text values and essentially acts as a search function.
In the example above, we used the “contains” operator to select all values where the city (column E) contains the letter “J”.
ORDER BY Keyword
The ORDER BY clause allows users to determine how to sort the output of a query.
The ORDER BY keyword must be followed by one or more columns to sort the data by either in ascending or descending order.
By default, ORDER BY sorts the data in ascending order, but you can explicitly add the keyword DESC to sort in descending order or ASC to sort in ascending order.
Let’s look at a sample QUERY function that uses the ORDER BY clause.
In the example above, we used the clause ‘ORDER BY B’ to sort our selection by column B (last_name).
By default, the order is in ascending order.
We can add the ‘DESC’ keyword at the end of our ‘ORDER BY’ clause to order by the indicated column in descending order.
The LIMIT clause is used to constrain the number of rows returned by a query.
It’s particularly useful in large databases where returning all rows would be inefficient and unnecessary.
By specifying a LIMIT, you can retrieve just a set number of rows based on the value you provide.
We can add an ORDER BY clause as well to ensure that the output shows the top or bottom n values based on a specific criteria.
Let’s take a look at a sample query that uses the LIMIT keyword.
In the example above, we added “LIMIT 5” to our query string to only show the first five rows of the resulting dataset.
Queries in Google Sheets support arithmetic operations such as addition, subtraction, division, and multiplication.
In the example above, we have two columns: id and Days Subscribed. We want to convert our Days Subscribed column to months by dividing each value by 30.
We can do this easily by adding a column H/30 to our SELECT clause.
In the example above, we want to combine the Total Pending Orders(column I) and Total Completed Orders(column J) into a single column by adding both values.
In our SELECT clause, we can use the ‘+’ operator to add the I and J columns together.
The LABEL clause allows you to assign temporary labels to a column of a query output.
This feature is useful for customizing the display of your results, especially when the original column headers are not descriptive or user-friendly.
The syntax for the LABEL clause is as follows: LABEL column ‘new_label’.
You can label multiple columns in a query byseparating each LABEL statement with a comma.
Let’s look at an example of the LABEL clause in action.
In the query used above, our SELECT clause divides column H by 30.
We can use the following query string to add new labels to our columns:
SELECT A,H/30 ORDER BY H DESC LABEL A ‘subscriber ID’, H/30 ‘months’
GROUP BY Keyword
The GROUP BY clause is essential when you want to aggregate data.
It groups rows that have the same values in specified columns into summary rows, like “find the total revenue per region.”
It’s used in tandem with aggregation functions (such as COUNT, SUM, or AVG) to perform calculations on groups of data values.
For example, let’s say we want to find the total number of completed orders for each given state using our sample dataset.
We can use the following QUERY function to output a table containing this data:
=QUERY(A1:J26,”SELECT F,COUNT(A) GROUP BY F”,1)
New columns made with aggregation functions will have default names indicating the aggregation function used and the column being aggregated.
We can use the LABEL clause to rename these new aggregate columns.
Aggregation functions in queries work by executing a particular operation across an entire column.
These aggregation functions typically appear in the SELECT, ORDER BY, LABEL, and FORMAT clauses.
Let’s go over each of the supported aggregation functions.
This function returns the average value of all values in the column within a group
In the query above, we grouped our data by state (column F) and used the AVG function to find the average number of completed orders per state.
This function returns the count of all elements in the column for a group. The count function ignores null cells.
In the query above, we grouped our data by state (column F) and used the COUNT function to find the total number of ids found for each state.
This aggregation function returns the maximum value in the column for a group
In our example above, we used MAX to find the highest number of completed orders for a single account per state.
This aggregation function returns the minimum value in the column for a group
In our example above, we used MAX to find the lowest number of completed orders for a single account per state.
The SUM function returns the sum of all values in the column for a group.
We used the SUM aggregation function in the query above to find the total number of completed orders broken down by state.