Skip to main content

8.6 Opportunity Query

With our Lead Query created, next, we will create an Opportunity query. Create a SELECT statement for SFDC.STAGE.OPPORTUNITY with a table alias of “O” and include the following fields:

Id, CreatedDate, Revenue, Stage, AccountId, CloseWonDate

If you find that the first attempt does not work perfectly, use the errors to help guide you along the way. Here are some tips:

  1. Did you include commas after the field name?
  2. Did you add the table Alias correctly?
  3. Did you declare you get single vs double quotes correct?
  4. Did you remember to alias your table and fields?

Opportunity Query Result:

SELECT 
O.Id,
O.CreatedDate,
O.ExpectedRevenue,
O.StageName,
O.AccountId,
O.CloseDate,
O.Type
FROM SFDC.STAGE.OPPORTUNITY O;

Revenue Pipeline

NameObject & FIeld OriginDate DimensionDescriptionDefinition
Revenue PipelineOpportunity.RevenueOppotunity.CreatedDateOpportunity RevenueTotal Revenue pipeline (open and closed opportunities)

Our next metric is called “Revenue Pipeline.” The distinction of pipeline indicates that we are measuring revenue at any Opportunity stage. This is a great example where having well-written definitions and a universal understanding of metrics and facts are very important. Because there are no implied constraints or calculations required, you can simply use the Opportunity Revenue field in your query. As it turns out we don’t need to change. As it turns out, the standard field as it stands within the opportunity is sufficient and already included in our query.

Result

O.ExpectedRevenue

New Customer Won Opportunities

NameObject & FIeld OriginDate DimensionDescriptionDefinition
New Customer Won OpportunitiesOpportunity.StageOppotunity.CloseWonDateOpportunity COUNT WHERE Stage = ‘Closed Won’ AND Type = ‘New Customer’Count of won opportunities for new customers

The next metric is “New Customer Won Opportunities” where you get to put your conditional expression skills to use again. This time, you have two conditions. Give it a shot on your own.

  1. Did you declare you get single vs double quotes correctly?
  2. Did you utilize AND / OR to add two conditions?

Result

CASE
WHEN O.StageName = 'Closed Won' AND O.Type = 'New Customer' then 1
ELSE 0 END "New Customer Won Opportunities"

Customers

NameObject & FIeld OriginDate DimensionDescriptionDefinition
Customers Opportunity.Id,
Opportunity.AccountId,Opportunity.Stage
Oppotunity.CloseWonDateDISTINCT Accounts with  WHERE Opportunity Stage = ‘Closed Won’Distinct count of Accounts where Closed-Won Opportunities > 0 

Our final metric is designed to measure the total number of distinct customers in Salesforce. A “Customer” in this case has a very specific definition that requires an Account to have at least 1 closed-won opportunity. Unlike the other metrics, I will share how we achieved this metric in Salesforce. I would need to build an Opportunity report that filters data for Closed-Won deals and then calculate Unique Account IDs.

Creating this metric in SQL is no different. In the spirit of breaking down a problem into its smallest parts, we need to:

  1. Isolate AccountIds that have a Closed-Won opportunity.
  2. Aggregate and COUNT the distinct AccountIds to determine  our “Customers” metric.

At this point, while writing our SQL for an Opportunity we can only perform 1 of the steps. Create a field that returns the AccountIDs for Opportunities that have reached a Closed-Won stage. 

Hints

  1. You have technically built a nearly identical field.
  2. Rather than a 0/1, you want to return the AccountId field.
  3. If the condition is false (meaning that the opportunity was not closed won), you should just return a NULL value.

Result

CASE
WHEN O.StageName = 'Closed Won' THEN O.AccountId
ELSE NULL END "Closed Won AccountIDs",

Or you can use

IFF  (O.StageName = 'Closed Won' ,O.AccountId, NULL)
"Closed Won AccountIDs2"

As a recap for this field, we have not calculated the final metric, but we have created a calculated field that we need to accurately measure “Customers”. We will visit this aggregate metric at the same time that we work on the Qualified Lead to Won Opportunity Ratio.

Final Opportunity Query Result

SELECT
O.Id,
O.CreatedDate,
O.ExpectedRevenue,
O.StageName,
O.AccountId,
O.CloseDate,
O.Type,
CASE
WHEN O.StageName = 'Closed Won' AND O.Type = 'New Customer' then 1
ELSE 0 END "New Customer Won Opportunities",
CASE
WHEN O.StageName = 'Closed Won' THEN O.AccountId
ELSE NULL END "Closed Won AccountIDs",
FROM SFDC.STAGE.OPPORTUNITY O;

8.11 Documentation and Delivery

In any data analysis, data science, or data engineering project, clear and thorough documentation is as crucial as the technical execution itself. Proper documentation ensures that your SQL queries, data models, and analytical insights are accessible, understandable, and usable by others, including your future self. This section will cover the importance of documenting SQL queries and their intended outcomes, along with best practices for effective documentation and delivery of your work.

Knowledge Sharing: Documentation acts as a knowledge base, allowing team members to learn from existing work. It helps new team members get up to speed and supports cross-functional collaboration.

Reproducibility and Consistency: Documenting the context, assumptions, and parameters used in your queries ensures that results can be reproduced and verified, which is critical for maintaining data integrity and consistency across reports and analyses.

Maintenance and Scalability: As data projects evolve, queries and data models need to be updated or optimized. Comprehensive documentation makes maintenance easier and supports the scalability of your data infrastructure.

Best Practices for Documenting SQL Queries

Commenting Within Queries: Use comments liberally within your SQL scripts to explain complex logic, the purpose of specific clauses, and the rationale behind choosing certain approaches over others. Comments can be inline or block comments, depending on the complexity of the explanation.

Query Descriptions: At the beginning of each SQL script or stored procedure, include a descriptive header that outlines the script’s purpose, author, creation date, and any modifications along with their dates and reasons.

Data and Metrics Dictionary: Maintain a data dictionary or schema documentation that describes each table and column used in your queries, including data types, constraints, and any relationships between tables. This is especially important in complex databases with numerous tables and fields.

Assumptions and Context: Document any assumptions made during the query development process, such as data cleanliness, uniqueness of key fields, or time zone considerations. Include context about the business logic or analytical goals driving the query’s design.

Result Interpretation: Guide how to interpret the results of your query, especially for complex analyses that might not be immediately intuitive. Include any caveats or limitations of the analysis.

Version Control: Use version control systems like GIT to manage your SQL scripts. This allows for tracking changes over time, understanding the evolution of your queries, and facilitating collaboration among team members.

External Documentation: For complex projects, supplement in-code comments with external documentation, such as Wiki pages, README files, or data analysis reports. These documents can provide broader context, detailed methodology, and analytical insights derived from the queries.

8.9 Pulling Leads and Opportunities into Your Joined Query

The queries from this lab have been re-organized logically before we combine them:

LEADS QUERY

SELECT  
L.Id, 
L.CreatedDate, 
L.ConvertedOpportunityId, 
L.ConvertedAccountId, 
L.Status, 
L.ConvertedDate, 

--CALCULATED FIELDS

'1' as "Lead Count", 
CASE WHEN L.Status = 'Open - Not Contacted' then 0 ELSE 1 END "Qualified Leads" 
FROM 
  SFDC.STAGE.LEAD L;

OPPORTUNITY QUERY

SELECT  
O.Id, 
O.CreatedDate, 
O.ExpectedRevenue, 
O.StageName, 
O.AccountId, 
O.CloseDate, 
O.Type, 
CASE WHEN O.StageName = 'Closed Won' 
AND O.Type = 'New Customer' then 1 ELSE 0 END "New Customer Won Opportunities", 
CASE WHEN O.StageName = 'Closed Won' THEN O.AccountId ELSE NULL END "Closed Won AccountIDs" 
FROM 
  SFDC.STAGE.OPPORTUNITY O;

FULL OUTER JOIN ON LEAD AND OPPORTUNITY

SELECT 
  L.Id as "Lead Id", 
  L.ConvertedAccountId as "Converted Lead Id", 
  O.Id as "Opportunity Id", 
  O.AccountId "AccountId" 
FROM 
  SFDC.STAGE.LEAD L FULL 
  OUTER JOIN SFDC.STAGE.OPPORTUNITY AS O ON L.ConvertedAccountId = O.AccountId;

Step 1: Attempt to merge and append the Lead fields into the full outer Join query.

Click to Reveal Results

Results

RESULT: Your result will appear and run as the following where we have added the calculated lead fields.

SELECT 
  L.Id as "Lead Id", 
  L.ConvertedAccountId as "Converted Lead Id", 
  L.CreatedDate, 
  L.ConvertedOpportunityId, 
  L.Status, 
  L.ConvertedDate, 
  O.Id as "Opportunity Id",   
  O.AccountId "AccountId",
  '1' as "Lead Count", 
  CASE WHEN L.Status = 'Open - Not Contacted' then 0 ELSE 1 END "Qualified Leads" 

FROM 
  SFDC.STAGE.LEAD L FULL 
  FULL OUTER JOIN SFDC.STAGE.OPPORTUNITY AS O ON L.ConvertedAccountId = O.AccountId;

Step 2: Attempt to merge the Opportunity fields into the full outer Join query.

Tips:

  1. You only need to include the Opportunity fields and calculated fields. Do not repeat the SELECT statement.
  2. You may have errors and that is Okay. We will troubleshoot and resolve them together!
Click to Reveal Results

Results

RESULT: Your result should look like the following where you have appended only the Opportunity fields

SELECT 
  L.Id as "Lead Id", 
  L.ConvertedAccountId as "Converted Lead Id", 
  L.CreatedDate, 
  L.ConvertedOpportunityId, 
  L.Status, 
  L.ConvertedDate, 
  '1' as "Lead Count", 
  CASE WHEN L.Status = 'Open - Not Contacted' then 0 ELSE 1 END "Qualified Leads", 
  O.Id as "Opportunity Id", 
  O.AccountId "AccountId", 
  O.CreatedDate, 
  O.ExpectedRevenue, 
  O.StageName, 
  O.CloseDate, 
  O.Type, 
  CASE WHEN O.StageName = 'Closed Won' 
  AND O.Type = 'New Customer' then 1 ELSE 0 END "New Customer Won Opportunities", 
  CASE WHEN O.StageName = 'Closed Won' THEN O.AccountId ELSE NULL END "Closed Won AccountIDs" 
FROM 
  SFDC.STAGE.LEAD L FULL 
  OUTER JOIN SFDC.STAGE.OPPORTUNITY AS O ON L.ConvertedAccountId = O.AccountId;

Resolving SQL Runtime Errors

In this particular example, and in many joined Salesforce tables you will run into conflicting column names. As a result, you will want to get into the habit of aliasing field names that are common across objects.

8.5 Create New Lead Fields with Expressions

To produce some of the metrics in our glossary we will use conditional expressions we learned about in chapter 6.

Lead Count

NameObject & FIeld OriginDate DimensionDescriptionDefinition
Lead CountLead.IdLead.CreatedDateAll LeadsAll leads loaded into Salesforce

Let’s create our first metric which takes us to our very first lab. If you wanted to represent the number ‘1’ for every record and call the field “Lead Count”, how would you format your SQL?

Result: 

'1' as "Lead Count"

Qualified Leads

NameObject & FIeld OriginDate DimensionDescriptionDefinition
Qualified Leads
Lead.LeadStatusLead.CreatedDateLeads WHERE LeadStatus NOT = ‘Open – Not Contacted’Qualified leads are those that are no longer Open-Not Contacted

For our second metric, we want to count “Qualified Leads”, which are represented as leads where the LeadStatus is not “Open – Not Contacted”. We will use the work you performed in lesson 6 to create a conditional expression. There are several approaches to solving this problem. 

Result: 

CASE
WHEN L.Status = 'Open - Not Contacted' then 
ELSE 1 END "Qualified Leads"

You can also use:

IFF (L.Status = 'Open - Not Contacted', 0 ,1) "Qualified Leads2"

Qualified Lead to Won Opportunity Ratio

NameObject & FIeld OriginDate DimensionDescriptionDefinition
Qualified Lead to Opportunity RatioLead.ConvertedOpportunityId
Opportunity.StageName
Lead.CreatedDateWon Opportunities / Total Qualified LeadsRatio of Won Opportunities divided by  Qualified Leads

Our third metric is a ratio which requires a numerator and denominator to successfully create the metric. We have already created the denominator, Total Qualified Leads. To create the numerator, “Won Opportunities” we need data from the Opportunity Object which will come later.

Additionally, the ratio we are looking for is a rollup/ aggregated value, so that will require some additional SQL that we will create together. We will come back and revisit this Lead to Won opportunity ratio metric later in this lesson!

Final Lead Query Result

SELECT 
L.Id,
L.CreatedDate,
L.ConvertedOpportunityId,
L.ConvertedAccountId,
L.Status,
L.ConvertedDate,
'1' as "Lead Count",
CASE
WHEN L.Status = 'Open - Not Contacted' then 0
ELSE 1 END "Qualified Leads",
FROM SFDC.STAGE.LEAD L;

Other query considerations based on previous lessons:

Alias

As you may have noticed, we only have alias fields that originate directly from the Leads table and not newly created fields.

WHERE (filter Data)

For the lead object, we have not defined any constraints that would filter and remove lead records from our result. In other words, we do not need a WHERE clause.

GROUP BY (group and aggregate)

At this stage, the lead record does not need to be aggregated. In other rods, we do not need a GROUP BY clause.

8.1 Structuring your SQL

this final lab, we are going to bring together many of the skills you have learned to build a full-funnel representation of your Salesforce organization, incorporating leads, accounts, and opportunities. In this full-funnel, you get to build a model of your leads and opportunities that you can slice and dice. We will continue to learn new concepts so that you have real, usable code that you can put to work immediately! 

Let’s quickly recap a few very important concepts….

Structuring your SQL

Together, we explored and learned the proper way to structure a SQL SELECT statement with the most important clauses, expressions, and keywords that will help you match your data needs with the right SQL code. With enough practice, you actually start to translate requirements into code.

Clause NameDescriptionStatus
WHEREFilters the results to include only those rows that meet the specified condition.CHECK
GROUP BYGroups rows that have the same values in specified columns, often used with aggregate functions like SUM, AVG,MAX,MINCHECK
ORDER BYUsed to sort the result set in ascending or descending orderCHECK
HAVINGFilters grouped data (only when using GROUP BYCHECK

AI is your Friend

As we covered early in this course, you should absolutely consider AI as a universal translator and learning aid. Knowing SQL fundamentals, Salesforce, and LLM basics will only accelerate your learning.

8.2 Final Lab Background

In our final lab, we will walk through a framework that I use to gather requirements and determine the path forward. Using the same principles you utilize to requirements for your Salesforce org, we can ensure we write SQL code that properly addresses a real-world problem. 

Background

A fairly common use case for reporting and analytics is a full-funnel model that helps marketing, sales, and revenue operations stakeholders complete the complete lead to closed won lifecycle and lifetime value for accounts (assumes accounts represent businesses or individual customers). Using the standard Salesforce data model where leads are converted to Accounts (with or without opportunities), we will capture the key metrics that matter most. We will create an SQL query that will help us model and explore the Salesforce funnel.

1. Scope, Outcome, and Impact Definition

Business Challenge: Typically I clarify the business challenge we are solving for before a single line of code is written. It is too easy to get lost in data, data wrangling, and coding. Having a clearly stated business challenge gives you a nice way to take a step back and ask “what problem are we solving?”

In this particular case, our business challenge is the lack of visibility for conversion rates from lead to opportunity, lifetime value in revenue, and geographic segmentation. For this course I have purposely left the business challenge vague but hopefully relatable for you.

Outcome: When you code a solution and deliver the data and/or analysis, what is the desired outcome?

In this particular case, we will consider the visibility of lifetime value by lead creates:

  • Impact: The impact of improved data-driven decision-making and enhanced understanding of the sales funnel’s performance should hopefully tie back to a specific metric or KPI. If not, it’s very hard to articulate the value of the SQL work you are performing and who benefits.
  • Business Pain: Not every project you work on has a groundbreaking revenue-generating and saving impact. Sometimes you are implementing code that improves an individual or group of users’ jobs easier. Do not underestimate the value of using data and creating data assets with SQL that can facilitate new knowledge. 

SQL mastery without the ability to articulate “WHY” is a waste of good skills, so hopefully through learning a practical application of SQL coding, you are armed for real success! Let’s move from our problem and put your new SQL skills to use!

2. Data Selection and Preparation

Object Selection For our lab, to obtain all of the data points we need to report our full-funnel, we are going to use LEAD, ACCOUNT, and OPPORTUNITY.

Field Selection – The fields we select likely mirror reports present in reports that you already use. There are several tactics that Salesforce admins and developers use to produce funnel metrics inside of Salesforce core. We will keep it simple for our lab and select the fields to produce 3 metrics:

1- Leads (from the lead object).
2 – Qualified Leads (from the lead object).
3 – Qualified Lead to Opportunity Ratio (requires lead and opportunity data).
4- Revenue (from the opportunity object).
5- Won Opportunities (from the opportunity object).
6- Customers (from the account object).

Data Exploration: Typically, when I build reports and analytics based on Salesforce data, I start with existing reporting and dashboard assets that business stakeholders already use. The self-service reporting is quite good in Salesforce, so it is not uncommon for many of the metrics within a funnel to exist already in some form inside of Salesforce.

Leads and Qualified Leads Example:

Opportunities, Revenue, and Customers Example:

5.2 JOIN Labs

Lab Joining 2 Data Sources with a Left Join

What if we wanted to return all leads with and without matched accounts? The following query with a LEFT JOIN will return all Leads that have and have not been converted to an Account.

SELECT L."Name" as "LeadName", A."Name" as "AccountName" 

FROM LEAD AS L 

LEFT JOIN ACCOUNT AS A 

ON L."ConvertedAccountId" = A."Id"

Lab Joining Two Data Sources with Full Outer Join

The following query with a LEFT JOIN will return all Leads that have and have not been converted to an Account and all Accounts that may not be associated with a Lead.

SELECT L."Name" as "LeadName", A."Name" as "AccountName" 

FROM LEAD AS L 

LEFT JOIN ACCOUNT AS A 

ON L."ConvertedAccountId" = A."Id"

After you perform a full outer join between leads and accounts, you may ask yourself three important questions:

  1. How many matches exist?
  2. How many leads exist without a converted account?
  3. How many accounts exist that were not converted from a lead?

We are not going to dig into these questions quite yet as we continue to learn and build up our working knowledge of expressions. We will refer back to this in our final lesson when we dive into a practical application of a full outer join.

Lab Joining Two Data Sources with Right Join

The following query with a RIGHT JOIN will return all Accounts that were converted from leads or exist without a lead, and only Leads that were converted to an Account. 

SELECT L."Name" as "LeadName", A."Name" as "AccountName" 

FROM LEAD AS L 

RIGHT JOIN ACCOUNT AS A 

ON L."ConvertedAccountId" = A."Id"

3.5 Create your Own Select Statement Lab

Select Individual Columns

To select individual columns, you can type the names of the columns wrapped in double quotes. It is important to note that databases vary in their syntax for selecting field names. Snowflake in particular requires double quotes when column names are case-sensitive or have spaces. For the purpose of our training, I will always use double quotes for consistency. Other databases like SQL Server allow you to use brackets [] or double quotes. Let’s explore this scenario together in lab 3:

Lab 3: Query Leads Table

  1. Name your tab Lab 3- Select Id and Name.
  2. Paste the following SQL SELECT statement into the query editor.

SELECT "ID","NAME" FROM LEAD;
  1. Include a semicolon at the end of your SQL Statement – You may notice that this example is terminated by a semi-colon. This is how you designate the end of your SQL statement. This is important particularly if you are including multiple SQL statements inside of one workbook. We will visit that later.
  2. Press run button in upper right to execute your query.

TIP: As a shortcut to clicking “run” you can press CTL + SHIFT + ENTER if you prefer keyboard shortcuts.

3. Add a couple more columns to your data set. As you will notice, Snowflake will attempt to provide fields available to.

Table, Schema, and Table Case Sensitivity

Similar to field names, your database, schema, or table names could be case-sensitive or have spaces. In these scenarios, they will need to be wrapped in double quotes like this. 

SELECT "ID","NAME" FROM “SFDC”.”STAGE”.”LEAD”;

As a general best practice I tend to keep all tables and schemas in all caps. Certainly, you shouldn’t have spaces, but it does happen. For our labs moving forward, I will continue to provide our database, schema, and table names without quotes.

Aligning Query and Analysis to Native Salesforce Reporting

If you have gotten to this point and are thinking to yourself, “that was a lot of work just to look at a table with Salesforce data,” you are correct. At this point, we have created the equivalent of a Salesforce Report vertical table report.

In fact, I created the exact same table in my Salesforce instance to get the same result.

As we dig deeper into SQL and start exploring the breadth and depth of functions and structures we will start to build a clear delineation between standard reporting in Salesforce and the advanced functions and structures using a data warehouse.

2.1 Hello World Lab

Hello World is the first lesson when starting any new programming language. The goal is to successfully use the programming language to output as a result. This demonstrates that you have the ability to input code and retrieve output.

SQL, or Structured Query Language, is a powerful tool for managing and querying data in databases. It uses a standardized syntax to perform various operations like retrieving, updating, inserting, and deleting data. In SQL, queries are commands that tell the database what to do. The basic structure of a query includes “SELECT”, which specifies what data to retrieve, and “FROM”, which identifies the source table. As we delve into SQL, understanding this structure will be key to effectively extracting and manipulating data.

Rather than exhausting you with syntax and dissecting a query, let’s jump in and use this Hello World exercise to quickly get your bearings and get our first win!

  1. Login to Snowflake.
  2. Upon login, you should see the “Worksheets” home screen.

I have numbered and documented the areas that we will work on for this course.

  1. Worksheets – Each worksheet will store your code. We will work on one worksheet per lab to keep our work together. 
  2. Data – Snowflake organizes your databases, schemas and tables within the Data tab. 
  3. Admin – This is where you manage your users, compute resources, billing, and other admin functions. We won’t spend much time worrying about Admin functions in Snowflake since we are operating within a trial.
  4. Create – The create button is in the upper right corner of the screen and exists as the starting point for creating all assets. When on the Worksheets tab, this button creates a new worksheet. When on the Data tab, it allows the creation of many different data assets.

3. Click the Create button, then select “SQL Worksheet”.

Snowflake will create a brand new worksheet, moving you from the home screen to an individual worksheet view. We will explore this view together.

  • Back Button – Though labeled “Worksheets”, the upper left button serves as a back button to get back to the home screen which displays a list of your worksheets and other Snowflake assets.
  • Worksheet Tab – The newly created worksheet by default is named with a time stamp. As you open more worksheets, they are added to the tab list. We will cover renaming and closing worksheets
  • Add Worksheet Button – This button allows you to create a new worksheet, which is an area where you can write and execute SQL queries.
  • Databases Tab – This section allows you to view and select from available databases within your Snowflake environment. Selecting a database will enable you to work with the objects (like tables and views) within that database.
  • Database Explorer – This area displays the hierarchy of database objects, such as schemas and the tables or views within them. You can navigate through your database structure here and perform actions like creating or modifying objects.
  • Current Worksheet Identifier – This indicates which worksheet you are currently working on. If multiple worksheets are open, they will appear as tabs in this area, and you can switch between them by clicking on their names.
  • User and Warehouse Information – This shows the current user account you’re logged in as (ACCOUNTADMIN in this case) and the compute warehouse that’s currently in use (COMPUTE_WH). The warehouse is the processing power for running your queries. We will cover errors you could see if this information is not filled in.
  • SQL Editor area – This is the space where you can write your SQL queries to be executed against the database. The text shown is an example of a simple SQL query.
  • Run Button – This button executes the SQL query or queries written in the currently active worksheet.

4. Execute Hello World Select Statement – Paste the following code into the SQL Editor area. I will explain these elements one by one.

SELECT 'Hello World' as "My First Field" 
  1. Click the “Run” button.

Your result should look like this:

Success! We just ran our first select statement. We didn’t query a database when we executed this code. We will perform that task together in our next lab. If you didn’t see the result and get an error, you can scroll down and explore some of the common errors and issues you may need to review with syntax, formatting, or Snowflake nuance.

5. Name your worksheet – To save our place, let’s name the worksheet. Double-click on the Worksheets tab name and type the value “Lab1 Hello World”.