Skip to main content

3.2 Salesforce / Snowflake Integrations

Salesforce / Snowflake Integrations

With recent advancements in Salesforce CRM Analytics and Data Cloud, we have Salesforce native data pipelines into Snowflake. I have covered this topic extensively for Salesforce Ben having written two articles that highlight the latest methods for Salesforce native Snowflake connectivity. The skills you are learning in these labs will allow you to understand what is possible with data once it leaves Salesforce. 

The skills you are learning now will help you provide guidance when the time comes to pull data out of any relational database into Salesforce. 

2.4 Debugging with AI

Since 2023, and we live in a world where AI chatbots can serve as SQL co-pilots and QA checkers!. In this lab, I copy pasted code that was invalid because of the double quote formatting. To debug this in ChatGPT, you can copy and paste your SQL and then copy/paste the error. I have included some prompts that you can use to help you debug code.

In mid 2024 Snowflake also introduced their own native Copilot intended to help you write and review SQL having context of your tables and fields. As you learn SQL, you will be able to quickly enhance and super charge your work using co-pilot. Using LLM generated code without understanding SQL is particularly dangerous if you are responsible and accountable for delivering guidance on data.

Let’s explore the same problematic code in ChatGPT. If you don’t use ChatGPT, you can use any LLM like Google Bard and get the same result. As you can see, my prompt clarifies that I am asking for help specifically for Snowflake. 

Important note when using LLMs for query debugging:

  1. Make sure you include the error returned by Snowflake when you create your prompt. This will help provide context.
  1. Make sure when you copy code and and aerror into a LLM chat bot, you provide proper context in your prompt. I have provided a good prompt template below;

You are a Snowflake SQL expert. You will help guide me through troubleshooting and refining my code. I am expecting the SQL below to: [EXPLAIN THE OUTCOME OR OUTPUT]

[INSERT YOUR SQL CODE]

When I run this code, I get this error:

[INSERT YOUR ERROR HERE]

2.3 Anatomy of a Database

Now that you have written your first SQL statement we are going to take one step back before we charge forward to understand the anatomy of a relational database. The information you learn here is universal to any modern database.

Without getting into the detailed nuances and technology architecture of Snowflake and a Relational Database Management System (DDBMS), the concepts you will learn in this course for Data Query Language (DQL) will translate to databases like Oracle, Microsoft SQL, MySQL, PostgreSQL, and others.

Navigating your Database

  1. Database
  • Definition: A collection of schemas, tables, and other objects
  • Purpose: Organizes and stores data in a structured way
  • Salesforce Analogy: Comparable to a Salesforce Org, which contains multiple objects, fields, and records
  1. Schema
  • Definition: A logical container within a database that holds tables, views, and other objects.
  • Purpose: Helps organize and manage data at a more granular level within the database and allows for fine grained control including but not limited to access and security.
  • Salesforce Analogy: Comparable to a group of related objects (like Account and Contact objects) within a Salesforce Org
  1. Table
  • Definition: A structured set of data held in rows and columns
  • Purpose: Stores actual data entries
  • Salesforce Analogy: Comparable to an individual Salesforce object (like the Account or Contact object) that holds records (data entries)

Working with Fields

Within your database table, you will spend a lot of time referencing and selecting fields as you build queries. We won’t get into the depths of database definition language SQL, meta data management or or query optimization. 

  1. Field (Column)
  • Definition: A single piece of data stored in a table, representing a specific attribute of the data. Fields are the columns in a table.
  • Purpose: Holds individual pieces of data for each record (row) in the table. Each field has a data type (e.g., integer, varchar, date) that defines the kind of data it can store.
  • Salesforce Analogy: Comparable to a field in a Salesforce object (like the FirstName or LastName field in the Account object).

Field Configuration and Metadata

Data Types

  • Definition: Specifies the type of data that can be stored in a field. Common data types include integer, varchar (variable character), date, boolean, etc.
  • Purpose: Ensures that the data stored in a field is of the correct type, which helps maintain data integrity and supports efficient data processing.
  • Example:
    • Integer: Used for whole numbers.
    • Varchar: Used for text data.
    • Date: Used for dates.
    • Boolean: Used for true/false values.
  • Salesforce Analogy: Similar to the data types of Salesforce fields, such as text, number, date, and checkbox.

Primary Keys

  • Definition: A field (or combination of fields) that uniquely identifies each record in a table.
  • Purpose: Ensures that each record in the table is unique and can be efficiently retrieved.
  • Example: A CustomerID field in the Customers table.
  • Salesforce Analogy: Similar to the Id field in Salesforce objects, which uniquely identifies each record.

Foreign Key

  • Definition: A field in one table that links to the primary key of another table, establishing a relationship between the two tables.
  • Purpose: Enforces referential integrity and allows for the creation of relationships between tables.
  • Example: An OrderID field in the OrderDetails table that links to the OrderID in the Orders table.
  • Salesforce Analogy: Similar to a lookup or master-detail relationship field in Salesforce, which links one object to another.

Snowflake vs traditional RBDMS

All of these concepts you have learned about in this section are good for context and understanding of the anatomy of a relational database. Snowflake itself originally was not designed to operate as a RBDMS, but rather a cloud data platform that inherits many of the common properties of a RBDMS in its innovative cloud artchitecture. 

Concepts like database, schema, and table help organize and secure data. With this basic understanding of the anatomy of your database, we can return to creating and exploring SQL data query language to access and manipulate data!

2.2 Basic SQL Statements – DQL

Basics SQL Statement

Now, we will dig into the components of a SELECT statement so you understand the building blocks of the code you just ran and how we will build upon Hello World to start querying our database.

SELECT

Most of the SQL you will learn early on starts with a command that instructs the database what we are going to do. All of our queries in this course will generally start with a “SELECT” command. The role of the SELECT command is to fetch data from a database. This is one of several “Data Manipulation Language” DML commands used to insert, update, delete, and select data from a database.

Literal

Instead of specifying a column name from a table, we entered a simple text string also known as a literal: ‘Hello World’. This demonstrates that a SELECT statement can also be used to output fixed, specified values, not just data from tables. Literals are commonly used any time you need to declare a value that is known and has meaning. For example when you filter or write a conditional statement. We will cover all of these use cases together.

Literals are declared using single quotes: Half the battle in learning any new programming language is how single and double quotes are used.

Column Name

When we get to the next lab we will upload data to Snowflake and SELECT specific columns. Column names are encapsulated in double-quotes in Snowflake anytime you require case sensitivity or have special characters like spaces in your column names. For example, if we wanted to select an Id and Name field in 1 query, it would look like this. We will get to create a query like this in our next lab.

SELECT “Id”, “Name”

Column names are declared in double quotes when you need case sensitivity. If you do not include double quotes your column names will be represented in all caps.

“Id” is interpreted by Snowflake as Id

Id is interpreted by Snowflake as ID

Alias

In our SELECT statement, we have our ‘Hello World’ literal value but we don’t have a column name specified to hold this value. To provide a clear, user-friendly name we specified an Alias in double quotes to name the column.  This is an important concept that you will use in future labs where we may change an existing column name.

 as “My First Field”

It is important to note that the word “as” is not required in SQL syntax. It does help for readability as you start learning. For consistency, I will include the word “as” anytime I implement an alias. In the same way you have API names and field Labels in Salesforce, aliases provide a clean and easy-to-read label for a column.

Practical Use Case for an Alias

Typically, when working with Salesforce data in Snowflake, you are using API names (which is preferred). Aliases are a great way to deliver labels or without underscores and trailing “__c”.  Here is an example

SELECT "my_customfield__c" as "My Custom Field" FROM "Leads"

NOTE: When you review SQL written by other developers you may notice the “as” keyword is not included. This is an optional keyword for readability of code but not necessary. For the sake of consistency and learning, we will always include the “as” keyword.

FROM

The FROM clause indicates the database object from which you are pulling data. When you use a SELECT statement, you’re usually interested in getting data from a particular table or view in your database. The FROM clause is where you specify that table. Syntax: In a simple query, the FROM clause:

SELECT column1, column2 FROM table_name;

In the next lab we will get to execute the following SQL and view the results:

SELECT "Id", "Name" FROM "Leads"

Debugging Code

A lot of your adventures in SQL will be debugging and troubleshooting code that does not execute. 

Character Formatting When Copy Pasting Into Snowflake

Luckily in Hello World, because we are not querying a database, the only thing that could go wrong is copy and paste into Snowflake. These are tough to pick out to the naked eye. As you can see, when I copy pasted from Google Sheets, there was a problem with our double quotes. The fix was re-typing the double quotes.

2.1 Hello World Lab

Hello World is the first lesson when starting any new programming language. The goal is to successfully use the programming language to output as a result. This demonstrates that you have the ability to input code and retrieve output.

SQL, or Structured Query Language, is a powerful tool for managing and querying data in databases. It uses a standardized syntax to perform various operations like retrieving, updating, inserting, and deleting data. In SQL, queries are commands that tell the database what to do. The basic structure of a query includes “SELECT”, which specifies what data to retrieve, and “FROM”, which identifies the source table. As we delve into SQL, understanding this structure will be key to effectively extracting and manipulating data.

Rather than exhausting you with syntax and dissecting a query, let’s jump in and use this Hello World exercise to quickly get your bearings and get our first win!

  1. Login to Snowflake.
  2. Upon login, you should see the “Worksheets” home screen.

I have numbered and documented the areas that we will work on for this course.

  1. Worksheets – Each worksheet will store your code. We will work on one worksheet per lab to keep our work together. 
  2. Data – Snowflake organizes your databases, schemas and tables within the Data tab. 
  3. Admin – This is where you manage your users, compute resources, billing, and other admin functions. We won’t spend much time worrying about Admin functions in Snowflake since we are operating within a trial.
  4. Create – The create button is in the upper right corner of the screen and exists as the starting point for creating all assets. When on the Worksheets tab, this button creates a new worksheet. When on the Data tab, it allows the creation of many different data assets.

3. Click the Create button, then select “SQL Worksheet”.

Snowflake will create a brand new worksheet, moving you from the home screen to an individual worksheet view. We will explore this view together.

  • Back Button – Though labeled “Worksheets”, the upper left button serves as a back button to get back to the home screen which displays a list of your worksheets and other Snowflake assets.
  • Worksheet Tab – The newly created worksheet by default is named with a time stamp. As you open more worksheets, they are added to the tab list. We will cover renaming and closing worksheets
  • Add Worksheet Button – This button allows you to create a new worksheet, which is an area where you can write and execute SQL queries.
  • Databases Tab – This section allows you to view and select from available databases within your Snowflake environment. Selecting a database will enable you to work with the objects (like tables and views) within that database.
  • Database Explorer – This area displays the hierarchy of database objects, such as schemas and the tables or views within them. You can navigate through your database structure here and perform actions like creating or modifying objects.
  • Current Worksheet Identifier – This indicates which worksheet you are currently working on. If multiple worksheets are open, they will appear as tabs in this area, and you can switch between them by clicking on their names.
  • User and Warehouse Information – This shows the current user account you’re logged in as (ACCOUNTADMIN in this case) and the compute warehouse that’s currently in use (COMPUTE_WH). The warehouse is the processing power for running your queries. We will cover errors you could see if this information is not filled in.
  • SQL Editor area – This is the space where you can write your SQL queries to be executed against the database. The text shown is an example of a simple SQL query.
  • Run Button – This button executes the SQL query or queries written in the currently active worksheet.

4. Execute Hello World Select Statement – Paste the following code into the SQL Editor area. I will explain these elements one by one.

SELECT 'Hello World' as "My First Field" 
  1. Click the “Run” button.

Your result should look like this:

Success! We just ran our first select statement. We didn’t query a database when we executed this code. We will perform that task together in our next lab. If you didn’t see the result and get an error, you can scroll down and explore some of the common errors and issues you may need to review with syntax, formatting, or Snowflake nuance.

5. Name your worksheet – To save our place, let’s name the worksheet. Double-click on the Worksheets tab name and type the value “Lab1 Hello World”.

1.4 Instructor Bio

Below is my official bio, but more importantly my social links to connect directly with me. I want to hear about your experience, background and how we can make this course the best it can be to help you along your journey. Don’t hesitate to reach at any point.

Formal Bio

Ryan brings over 20 years of experience as a Salesforce customer. Throughout these two decades, Ryan has donned various hats in analytics, working as a hands-on practitioner, consultant, entrepreneur, author, and community leader. After 15 years in enterprise software, Ryan shifted gears from being a vendor to a customer. As the VP of Data Analytics for a mid-sized financial services company, he implemented and executed his analytics strategy playbook over a span of 4 years. With thousands of hours dedicated to building a successful analytics competency center using Salesforce, Snowflake and Azure, Ryan has channeled his expertise, tools, and recent AI advancements into his latest endeavor, DataTools Pro.

Social and contact links to connect with me

1.3 Expectations and Outcomes

This course will focus on Data Query Language, DQL for short. The aim of introducing SQL is to focus on query and analysis use cases for selecting data from your database. This is a great starting point and gives you coverage and common ground when collaborating with analytics and data teams. 

We will leverage Snowflake to import data, but will not get into SQL commands for creating and administering a database. A general understanding of Data Query Language will open the door for you to explore other areas like Data Definition, Data Manipulation, and administrative functions like Data Control, and Transaction Control. I will explain various commands at a high level to give you a good foundation.

The SQL concepts and code you will learn in this course are universal. It will translate to most Relational Database Management Systems (RDBMS) databases including, but not limited to, Microsoft SQL server, MySQL, POSTRESQL, and others. I will also reference Salesforce SOQL from time to time. We are going to use Snowflake as the cloud database of choice due to: 

  1. Synergies with Salesforce and native integration. 
  2. Minimal friction to start building. 
  3. It being a single, web-based experience to learn from.

1.1 Why SQL Proficiency Matters in the Salesforce Ecosystem

  • Salesforce has been investing heavily in Data Cloud, meaning more customer-related data will flow in and out of Salesforce. Having a background and understanding of how those processes work with an SQL foundation will help you understand the needs and requirements of your data team.
  • Salesforce’s “Bring your own Datalake” BYOL will allow external data sources like Snowflake and Databricks to natively enhance Salesforce as a platform. Upskilling and understanding SQL will ensure you can make yourself available to support these initiatives. 
  • Knowing SQL will allow you to deliver direct support to Marketing Cloud engineers working to extract and format the right data into marketing journeys.
  • SQL is a gateway into the wide-open world of data and analytics outside of Salesforce. Snowflake in particular and Databricks are the two leading data platforms for analytics. 

1.2 Marketability of SQL in your Skills Arsenal

  • Employees with SQL skills earn an average of 10% more than those without, according to PayScale.com.
  • The job market for SQL skills is projected to grow 10% between 2020 and 2030, much faster than the average for all occupations. (Source: Bureau of Labor Statistics)
  • 72% of hiring managers in the United States consider SQL skills to be “very important” or “essential” when hiring for data-related roles. (Source: Indeed 2023 Hiring Report)

Perspective

I have taken an “analytics first” approach to this course where we will learn SQL based on common real-world use cases where we need data out of Salesforce to facilitate reporting and analytics. We are taking this step together and making connections to the power of Salesforce reporting functions. To draw comparisons and differences between Salesforce native features and SQL, I hope to explain how and why various SQL functions exist in the context of a data or analytics professional using Salesforce data outside of Salesforce.