In previous lessons, you learned about the WHERE clause, which allows you to filter individual records in your data. You also learned about aggregating and summarizing data using GROUP BY. What if you want to filter your data after it has been aggregated?
What does HAVING do?
The HAVING clause in SQL is used to filter groups created by the GROUP BY clause based on a specified condition. It works similarly to the WHERE clause but it is applied to groups rather than individual rows. For example, HAVING can specify conditions involving aggregate functions like SUM, AVG, COUNT, etc. You cannot use the WHERE clause to evaluate aggregate functions.
Let’s start with a familiar example in Salesforce reports. In the following report example, what if I wanted to report countries that have >1000 locations? Only after aggregating all leads do you know this information. If you look at the example below USA has 2479 locations when we aggregate all states. We need a function that will evaluate the SUM of Number of Locations. That is where the HAVING function comes into play. It evaluates aggregated values so you can filter your data.
This is where the HAVING clause can help accommodate this requirement.
Before we apply the HAVING function, lets build SQL to represent this Salesforce report. You should give it a shot on your own to put your current SQL skills to the test.
If you landed on something like the following, we can work forward to injecting the HAVING logic once you can produce the following result.
SELECT "Country",
SUM("AnnualRevenue") "Annual Revenue",
SUM("NumberofLocations__c") "Number of Locations",
Count ("Id") "State",
Count (DISTINCT "State") "State"
FROM SFDC.STAGE."Lead"
GROUP BY "Country"
Implement HAVING
Usage with GROUP BY: HAVING is used after the GROUP BY clause. While GROUP BY groups rows that have the same values in specified columns, HAVING filters these groups based on a condition.
To address the question posed above we would implement a HAVING clause like this:
HAVING "Number of Locations" <1000;
SELECT "Country",
SUM("AnnualRevenue") "Annual Revenue",
SUM("NumberofLocations__c") "Number of Locations",
Count ("Id") "State",
Count (DISTINCT "State") "State"
FROM SFDC.STAGE."Lead"
GROUP BY "Country"
HAVING "Number of Locations" <1000;
The final result would look like the following, omitting USA as there are more than 1000 locations.
Working with multiple HAVING Clauses
Like previous labs, you can apply multiple conditions within one clause. I have provided a couple of examples.
HAVING “Number of Locations” <1000 AND State > 0
HAVING “Number of Locations” <1000 OR State < 5
HAVING COUNT(*) > 1
Conclusion
In conclusion, mastering the HAVING clause in SQL opens up powerful capabilities for filtering aggregated data. While the WHERE clause is essential for filtering individual records, the HAVING clause steps in to refine your results after data has been grouped using the GROUP BY clause.
By understanding how to use the HAVING clause effectively, you gain the ability to set conditions based on aggregated values, such as sums, averages, counts, and more. This allows for nuanced data analysis and reporting, particularly useful when dealing with complex datasets where filtering based on aggregated results is necessary.