Skip to main content

4.4.1 Filtering Groups with HAVING Clause

In previous lessons, you learned about the WHERE clause, which allows you to filter individual records in your data. You also learned about aggregating and summarizing data using GROUP BY. What if you want to filter your data after it has been aggregated?

What does HAVING do?  

The HAVING clause in SQL is used to filter groups created by the GROUP BY clause based on a specified condition. It works similarly to the WHERE clause but it is applied to groups rather than individual rows. For example, HAVING can specify conditions involving aggregate functions like SUM, AVG, COUNT, etc. You cannot use the WHERE clause to evaluate aggregate functions.

Let’s start with a familiar example in Salesforce reports. In the following report example, what if I wanted to report countries that have >1000 locations? Only after aggregating all leads do you know this information. If you look at the example below USA has 2479 locations when we aggregate all states. We need a function that will evaluate the SUM of Number of Locations. That is where the HAVING function comes into play. It evaluates aggregated values so you can filter your data.

This is where the HAVING clause can help accommodate this requirement.

Before we apply the HAVING function, lets build SQL to represent this Salesforce report. You should give it a shot on your own to put your current SQL skills to the test.

If you landed on something like the following, we can work forward to injecting the HAVING logic once you can produce the following result.

SELECT "Country",

SUM("AnnualRevenue") "Annual Revenue",

SUM("NumberofLocations__c") "Number of Locations",

Count ("Id") "State",

Count (DISTINCT "State") "State"

FROM SFDC.STAGE."Lead"

GROUP BY "Country"

Implement HAVING 

Usage with GROUP BY: HAVING is used after the GROUP BY clause. While GROUP BY groups rows that have the same values in specified columns, HAVING filters these groups based on a condition.

To address the question posed above we would implement a HAVING clause like this:

HAVING "Number of Locations" <1000;

SELECT "Country",

SUM("AnnualRevenue") "Annual Revenue",

SUM("NumberofLocations__c") "Number of Locations",

Count ("Id") "State",

Count (DISTINCT "State") "State"

FROM SFDC.STAGE."Lead"

GROUP BY "Country"

HAVING "Number of Locations" <1000;

The final result would look like the following, omitting USA as there are more than 1000 locations.

Working with multiple HAVING Clauses

Like previous labs, you can apply multiple conditions within one clause. I have provided a couple of examples.

HAVING “Number of Locations” <1000 AND State > 0

HAVING “Number of Locations” <1000 OR State < 5

 HAVING COUNT(*) > 1

Conclusion

In conclusion, mastering the HAVING clause in SQL opens up powerful capabilities for filtering aggregated data. While the WHERE clause is essential for filtering individual records, the HAVING clause steps in to refine your results after data has been grouped using the GROUP BY clause.

By understanding how to use the HAVING clause effectively, you gain the ability to set conditions based on aggregated values, such as sums, averages, counts, and more. This allows for nuanced data analysis and reporting, particularly useful when dealing with complex datasets where filtering based on aggregated results is necessary.

4.3.1 Sorting Data with ORDER BY

As a recap, we have walked through clauses in a SQL SELECT statement in order to lead us up to this point where we are now ready to sort data.

Clause NameDescriptionStatus
WHEREFilters the results to include only those rows that meet the specified condition.CHECK
GROUP BYGroups rows that have the same values in specified columns, often used with aggregate functions like SUM, AVG,MAX,MINCHECK
ORDER BYUsed to sort the result set in ascending or descending order

The ORDER BY clause in SQL is important for sorting the results of a query in either ascending or descending order. It enhances the readability and usefulness of your data by organizing it according to specific criteria.

There are a lot of nuances related to sorting and data types (integers, strings, dates) and the expected behavior for how data is sorted.

Data TypeDescriptionExample ValuesSort Order Example
VARCHARVariable-length character string‘California’, ‘Florida’, ‘Texas’ORDER BY column_name ASC (Alphabetical order)
NUMBERNumeric data type, including integer and decimal100, 200, 300ORDER BY column_name DESC (Descending numerical order)
DATEDate values‘2024-02-01’, ‘2024-02-15’ORDER BY column_name ASC (Chronological order)
TIMESTAMPDate and time values, including time zone information‘2024-02-01 08:00:00’, ‘2024-02-01 09:00:00’ORDER BY column_name DESC (Reverse chronological order)
BOOLEANBoolean values (TRUE or FALSE)TRUE, FALSEORDER BY column_name ASC (FALSE, then TRUE)

Sorting Salesforce Data

Typically, sorting is a highly requested function for any list or report as it provides sequencing and prioritization of information. Here are the typical cases why sorting is so powerful:

  1. Sequencing and ranking by numeric values.
  2. Ordering data in sequential order by date/time.
  3. Organizing and grouping records together visually when they are not contiguous inside of your database.

Lightning list views and reports will allow you to sort by one column in descending and ascending order as it is presented to end users. Let’s explore how to apply this same sorting behavior to your data when you create a query.

Coding Order By 

ORDER BY is typically used at the end of a SQL query to sort the results by one or more columns. For example, SELECT * FROM Customers ORDER BY LastName ASC; sorts customers by their last names in ascending order.

The basic ORDER BY syntax is a follows…

ORDER BY [Field Name] [ASC | DESC ] 

Here is an example that we can run and explore together:

SELECT "COUNTRY",
"STATUS", COUNT(*) as "Lead Count",
COUNT (DISTINCT "STATE") "State Count"
FROM SFDC.STAGE.LEAD
GROUP BY "COUNTRY", "STATUS"
ORDER BY "COUNTRY";

As you can see by default the countries are sorted in ascending order. If you want to experiment you can try to apply the same sort by country in DESCENDING order. 

Your ORDER BY clause will look like the following:

ORDER BY “COUNTRY” DESC;

Next, let’s modify the ORDER BY field so we can see how data is sorted by changing the State Count:

 ORDER BY “State Count”;

Now, you can see the data is sorted in ascending order. Let’s see what that looks like if we sort the data in descending order:

ORDER BY “State Count” DESC;

Order by Multiple Columns

What if we want to order by country and then State Count? We can order our data and effectively sort it by multiple columns. This is a level of control you won’t have at your disposal without putting data into a tool like Excel. Understanding and using Order By in this way also sets the stage for more advanced analysis scenarios.

ORDER BY “COUNTRY” DESC, “State Count” ASC;

3.4 Querying Lead Data Lab

Create your Own Select Statement to Query Lead Data

Let’s build upon what we already learned in Lab 1 and review the following SELECT statement: 

SELECT * FROM LEAD;

Select All Columns

To select all columns in your table, you will enter the asterisk symbol “*” to signal that you would like all columns returned in the resulting query.

If you apply what you learned previously and click “Play” in the upper right-hand corner you will get a result that is more interesting than our Hello World example. To get more familiarity with the results section of the worksheet, I have numbered and documented notable features that we will use:

Navigating Query Results in Snowflake

When you run a query that selects all columns by clicking the “Run” button in the upper right-hand corner you will get a result that is more interesting than our Hello World example. To get more familiarity with the results section of the worksheet, I have numbered and documented notable features that we will use:

Note for Best Practice: As you review the sample Salesforce data, it’s important to know there are important data governance practices that should be implemented when loading Salesforce data into a data warehouse like Snowflake. You would typically evaluate a data masking policy for PII, and certainly for sensitive PII. That goes beyond the scope of this training but is something you should be aware of as data moves between Salesforce and queryable data warehouses.

  1. Results – The results of your query are displayed in a tabular format. 
  2. Column List – Clicking on Select Columns on results is a visual way to navigate and search for specific columns that you may want in your query. 
  3. Download – Download the results to CSV or Excel.
  4. Code Version – This is one of my favorite features as Snowflake tracks every time you click “Run”, so you can revert to a previous version of your code.
  5. Query Details – Displays the total number of records in your query.
  6. Profiling – Provides basic statistics about each column so you know the fill rate and in some cases.

7. Scrollbar – The minimalistic user experience of Snowflake is nice, but I always find myself looking for the scrollbar. You have to hover in the right area to visually expose the horizontal scroll bar.

8. Database Explorer – Clicking on the Database tab will give you full visibility of your databases, schemas, and tables. This is particularly important as you start combining and joining tables.

Navigating Results of your Query

2:01

Database and Schema Declaration

In Snowflake we have our database table and schema defined as part of the workbook as shown below. However, if you plan to share your SQL or run it outside of this workbook you will find yourself without context of the database or schema. 

A simple way to address that is to include the Schema and Database with your table name as shown below:

SELECT "ID","NAME" FROM SFDC.STAGE.LEAD;

Formatting your Query in Snowflake Editor

To terminate or declare the end of your SQL statement, you will enter a semi-colon “;”. This is particularly important if you are using a SQL editor to work, experiment and quickly iterate more than 1 query within a single editor.

Length 00:32

1.5 Developer Org Setup

To successfully complete this course you will need access to a Snowflake trial and preferably a Salesforce org with an administrator role to follow along. We will not perform any changes, data insertion, or extraction from Salesforce. Instead, I will provide sample data pre-extracted from a Salesforce trial org that we will load into Snowflake together.

Let’s start with the Snowflake signup process. Snowflake provides a 30-day free trial with plenty of credits.

Important Note: From time to time, Snowflake will make small tweaks to their Sign Up process. We have documented the process as of January 2024.

Snowflake Demo Account

  1. Visit the Snowflake Signup Link https://signup.snowflake.com/.
  2. Fill in your contact information, role, and geography, then click “Continue”.
  3. Select “Standard” edition – You can always upgrade to enterprise later if you need enterprise features. For this training, you will not need any enterprise features.
  4. Select AWS and choose the geography closest to you – It’s important to note that you need no working knowledge of any of the cloud providers and will not need to access or perform any further actions. The cloud provider selection indicates to Snowflake what data center your sample data and queries will run in Snowflake. We will cover this in more detail in an upcoming lesson. 
  1. Fill in the Snowflake questionnaire – Snowflake will ask you a few qualifying questions to understand your use case. The answers have no impact on how Snowflake performs.
  1. Open your email and look for an email from Snowflake Computing (no-reply@snowflake.net).
  2. Click on the button or link to activate
  3. Set a username – A username can contain only letters and numbers, so you cannot use an email address.
  4. Set a password – Keep the username and password in a safe place. There is no email to reset a password for administrators like Salesforce. I typically use a client password manager like 1Password or LastPass.

Success! Now we should be logged into Snowflake. We will do some basic admin housekeeping steps in an upcoming lab when we import our lab data.

Capture your Login URL

While logged in, go to the lower left corner of the screen and click on the Snowflake icon. You will see the instance of your Snowflake and where it is located. If you hover and click on the “Link” icon, you will copy the URL to your Snowflake instance. Keep this URL handy, as it will be required to access your Snowflake instance to log in. In the event you use Snowflake, your account ID and Organization IDs are also listed.

Salesforce Developer Account (optional)

In this course, I will reference Salesforce admin or reporting screens regularly as an illustrative step to explain SQL concepts and functions. I do not have topics or labs that will require you to extract or input data into your Salesforce org, but I do access features that require admin access. 

As such, I list this as an optional step if you do not have a Salesforce org, don’t have admin rights, or simply want to keep all of your learning activities independent of your work Salesforce org. You can follow these steps to get a free developer account:

  1. Go to the Salesforce Trials Page: https://developer.salesforce.com/free-trials.
  2. Find the section dedicated to Salesforce Developer Edition and locate the “Sign Up for the Salesforce Developer Edition” button.
  3. Fill out the provided form with the necessary information, and then click the “Sign Me Up” button. Salesforce will send an email to the specified address, containing instructions on how to access your new Developer Org. Make sure that you provide a valid email address because Salesforce will send login instructions to that email.
  4. Choose a unique username for accessing your Developer Org – It should follow the format of your email address and be easy to remember. For example, you could use something like “youremail@gmail.com.sfbensql”.
  5. You will receive an email from developer@salesforce.com with the subject “Welcome to Salesforce: Verify Your Account”.
  6. Click on the “Verify Account” button – This action will open a new browser page where you can reset your password.
  7. Input a new password on the password reset page that adheres to the specified password requirements in the “new password” field. Repeat the same password in the “confirm new password” field. Select a security question and provide an answer. Then, click the “change password” button.