Skip to main content

4.2.3 Translating Requirements to Code

Clarity on Translating Requirements to Code

This is where we pause for a minute on a step by step coding and take a step back to focus on an important practical application of what we are learning. Gathering requirements for SQL is similar to Salesforce where you need clear definitions and intent. It may require asking “why” multiple times to get clarification. 

In this scenario when provided guidance to count the number of states, it would seem logical to follow the same pattern. In a real world requirement gathering scenario, simply asking “why” would have revealed that I want to know how many states exist for each country. Creating new calculated fields on the fly is powerful but also requires a clear dictionary and good discipline to understand how the data will be used. 

Re-calculate State count to represent the number of unique States within each country. To calculate distinct values, we are adding a new “DISTINCT” keyword inside of the COUNT function, right before the field that you are calculating distinct values for. Counting distinct values is a big part of aggregating data. 

 COUNT (DISTINCT "STATE") "State Count"

 Our result tells us of the 19 leads, 14 of them reside in unique states.

8. Add a second dimension to group by “STATUS.” You will group by 2 dimensions. COUNTRY and then STATUS. Following some of the patterns you have learned, spend some time experimenting to see if you can add to round out your basic understanding of GROUP BY and AGGREGATION functions.


RESULT:

SELECT "COUNTRY",

 "STATUS",

 COUNT(*) as "Lead Count", 

 COUNT (DISTINCT "STATE") "State Count"

 FROM SFDC.STAGE.LEAD

 GROUP BY "COUNTRY", "STATUS";

How did you do? Hopefully, it looks something like the following result. The solution is simply add the “STATUS” value after the country and then add it to the SELECT statement too. That may not be obvious quite yet, but is a good way to learn how to use GROUP BY correctly.

I have provided a longer list of aggregation functions supported by Snowflake which may differ in some cases from other databases. There is a massive list to experiment with and learn from. 

FunctionBasic Definition
ANY_VALUEReturns any value from a group
AVGCalculates the average of a set of values
CORRComputes the correlation coefficient between two expressions
COUNTCounts the number of rows in a group
COUNT_IFCounts the number of rows in a group that meet a specific condition
COVAR_POPCalculates the population covariance
COVAR_SAMPCalculates the sample covariance
LISTAGGConcatenates values from a group into a single string
MAXFinds the maximum value in a set
MAX_BYReturns the value that corresponds to the maximum value in another set
MEDIANFinds the median value
MINFinds the minimum value in a set
MIN_BYReturns the value that corresponds to the minimum value in another set
MODEDetermines the mode (most frequently occurring value)
PERCENTILE_CONTCalculates a continuous percentile
PERCENTILE_DISCCalculates a discrete percentile
STDDEVComputes the sample standard deviation (alias for STDDEV_SAMP)
STDDEV_POPComputes the population standard deviation
STDDEV_SAMPComputes the sample standard deviation
SUMCalculates the sum of values
VAR_POPComputes the population variance
VAR_SAMPComputes the sample variance
VARIANCE_POPComputes the population variance (alias for VAR_POP)
VARIANCE, VARIANCE_SAMPComputes the sample variance (alias for VAR_SAMP)