JOINS in PostgreSQL

Welcome back, SQL-ites! Today, we’re going to take a significant leap forward in our journey of learning PostgreSQL.

A quick recap on where we’ve been recently:

In more broad terms, most of our PostgreSQL training so far has focused on how to best select and segment our data. However, there is much more to be accomplished in big data manipulation. Enter JOINS.

 

About JOINS & PostgreSQL

JOINS helps unlock the foundation of most SQL databases, that are known as relational databases. Specifically, when we’re able to triangulate multiple data sources together, combined with sorting, segmentation and some intuition, we can get some really powerful stuff.

So what does JOINS do? It allows you to relate data in one table to the data in other tables.

Think about some of our practice runs in PostgreSQL so far. We’ve looked at DVD rentals, customers, transactions, ID’s, and so on. What if we wanted to start linking in more detail customer names with ID’s, or transaction ID’s with film names, or all the above? Weaving data into an interactive story could allow us to see really cool things, such as which customers prefer certain genres, or perhaps unveil renting habits for a timely promotional offer to smooth out our top line revenue.

 

BASIC JOINS Syntax in PostgreSQL

As with many of our learning adventures, there is complexity and there are layers to our education. 🙂

First, it’s important to note that there are actually several types of JOINS, so this isn’t just an amorphous, blob-lump function. Here are some of the important kinds:

  1. INNER JOIN
  2. OUTER JOIN
  3. self-join

To start things out, we’re going to look at INNER JOIN first.

INNER JOINS Concepts for PostgreSQL

We’re going to jump headlong into this, at the expense of immediate clarity. Joy! Let’s say we have two tables which we’ll relate data between.

 

2017-09-10-001-JOINSA-Concept-Example-1

Above, we’ve pulled out some key points of interest between our tables, unique columns, and more importantly, a shared key (column) of customer_id between the two tables. We’ll explain more about keys in a minute. Ish.

We’ll use the shared column of (example) customer_id to further mash up the data. Let’s take a look at syntax now.

 

INNER JOINS Basic Syntax

If things aren’t 100% connected for you conceptually, I completely understand. Please do bear with me, as seeing some examples may help “close the loop”. Here’s what an INNER JOINS query might look like for a PostgerSQL usage:

 

SELECT A.pka, A.c1, B.pkb, B.c2

FROM A

INNER JOIN B ON A.pka = B.fka;

 

Let’s break this down a bit more.

The first line is extremely similar to what we’ve been doing all along. Calling in data. The subtle difference here is that we’re first including the respective table names. So, it’s tableA.column, tableB.column, tableB.column2.

Next, we specify the primary table we’re first selecting from, per usual. Then we specify the “donor” table that will be combined with the primary table via INNER JOIN. Further, we make this work by using the ON keyword which specifies the shared key columns. 🙂

 

More Explanation on the Inner Workings of JOINS (Pun Not Intended)

This has been a bit of a tricky concept for me to learn so far, so we’re going to walk through the nuts and bolts of JOINS as operated by PostgreSQL.

When JOINS is executed, PostgreSQL takes every row in our “Table A” and goes through Table B, looking for rows that match our specified condition of A.pka = B.fka. So combining our example and laymen’s terms, the JOINS function looks for common occurrences of the customer_id  so that additional, previously disparate data from separate tables may be awesomely fused together. 🙂

If a match is found between the two tables, it combines columns of both A and B rows into one row, and adds that newly combined row to the data set you get back.

 

Pardon the Interruption: PostgreSQL “life” hacks for JOINS

A couple of quick items that may prove useful to you as we get more familiar with joins.

  1. Sometimes your tables may share column names (Analytics data, anyone? Can I get an amen?), so you may need to use the syntax table_name.column_name for clarity.
  2. If you or someone you love has created excessively long table names, you can introduce a shorter alias (via the AS statement) for the table, and use accordingly.

 

Hey, I Thought We Were Going to Learn About INNER JOIN?

We are! And what’s better, we’re going to use a Venn diagram to explain it. 🙂 #DataScience.

Okay, the INNER JOIN PostgreSQL clause returns rows in Table A that have corresponding rows in Table B. Super scientific visualization below.

2017-09-11-002-JOINSA-Concept-Example-2

The overlapped area above are the results that will be returned to us.

 

INNER JOINS PostgreSQL Code Examples

Putting all the pieces together, here’s our first INNER JOINS code sample:

SELECT

customer.customer_id

first_name,

last_name,

email,

amount,

payment_date

FROM

customer

INNER JOIN payment ON payment.customer_id = customer.customer_id;

 

And below, what we get back from pgAdmin.

2017-09-11-003-JOINSA-INNER-JOINS-First-Example

 

Okay, a few notes. We went out of our way to say that overlapping column names need table names for distinction. Here’s what we didn’t say, you don’t need to specify table names for unique columns. Above, you can see that we’ve nicely meshed together customer ID with both customer information with payment information. Awesome!

I know this is a bit of a rough stop- but I’m really trying to work on getting more sleep these days. So it’s 12:30am, and I’ve got to head to bed. We’ll be back with updates on this post and more progress forward. Cheers!

Update – 9/16/2017 – Add’l INNER JOINS Examples

Welcome back again, SQL-ites! We’re back with a weekend update from our first look at INNER JOINS in PostgreSQL. Above, we left off with a basic example of INNER JOINS. To quickly recap, we might think of JOINS as basic data composition. Often in marketing and business useful data may begin in silos, thus, combination into useful form must occur.

Our next example of INNER JOINS is a slight modification to the first full example. We have the same basin INNER JOINS syntax, but have added ORDER BY customer.customer_id at the end, to clean the presentation of our data a bit.

SELECT customer.customer_id

first_name,

last_name,

email,

amount,

payment_date

FROM customer

INNER JOIN payment ON payment.customer_id = customer.customer_id

ORDER BY customer.customer_id ASC;

2017-09-16-004-JOINSA-INNER-JOINS-Second-Example

 

Using WHERE with INNER JOINS

Let’s continue to add modifications to our basic INNER JOINS statement. Below, we harness the power of both combination and selection to specify joined results, but for a specific customer ID. A use case for this might involve isolating or routing specific data after it’s been joined, for reporting purposes.

SELECT customer.customer_id

first_name,

last_name,

email,

amount,

payment_date

FROM customer

INNER JOIN payment ON payment.customer_id = customer.customer_id

WHERE customer.customer_id = 2;

2017-09-16-005-JOINSA-INNER-JOINS-Third-Example

DSC_0050 Zach Doty AS PostgreSQL Statement Cover Photo

JOINS Foundations: The AS PostgreSQL Statement

Intro to JOINS: the AS PostgreSQL Statement

What’s up SQL people? We’re back, and better than ever, in our foray to learn PostgreSQL. Since we’ve completed some intermediate skills challenges & learned GROUP BY, it’s time to examine JOINS.

Before we examine JOINS, there’s a key foundation piece we must cover first: the AS statement in PostgreSQL. Let’s jump in.

 

About the AS Statement

The AS statement in PostgreSQL enables us to rename sections of a table or table columns with a temporary alias, or almost a variable, for manipulation.

It’s a simple statement, so let’s see it in action.

 

1. Basic AS Statement Example

Our most basic example is a basic query where perhaps a column wasn’t named to our liking. Consider the following.

SELECT rental_rate AS film_cost

FROM film

LIMIT 10;

2017-08-30-001-AS-Statement-Syntax-Example-1

Great for an introductory example, but not inherently useful. Read on as we apply the AS statement more deeply.

2. Semi-Intermediate AS Statement Example

Let’s provide an example that’s a bit more engaged. Example, if we use aggregate functions, the column output doesn’t have a clean name attached to it. But no longer! The AS statement allows us to have the summation output in a GROUP BY statement to something we’ll recognize.

 

SELECT customer_id, SUM(amount) AS customer_ltv

FROM payment

GROUP BY customer_id

ORDER BY customer_ltv DESC

LIMIT 7;

2017-08-30-002-AS-Statement-Syntax-aggregate-example-2

 

This is something more useful for intermediate PostgreSQL work!

 

Wrap Up

We aren’t spending much further time here since this is a simple application and the JOINS statement is the function we’re truly after. If you’re just joining this series, check out our home page on how to learn PostgreSQL.

DSC_0006 Zach Doty Intermediate GROUP BY SQL Skills Challenge Cover Photo

Intermediate SQL Skills Challenge: GROUP BY

Hey there SQL-ites! Wow, it’s been awhile since I’ve last posted. Work has been crazy busy again, and just life in general. Sure feels good to be back, learning again! I’m daringly dropping straight back into my learnings from where we left off…in April!? Crazy.

Anyway, let’s get back to brass tacks. Before life and work got really crazy for me, we covered:

Another quick recap note, we’ve been using the famous DVD rental training database for our work. On to the good stuff.

 

GROUP BY SQL Skills Challenge #1

Let’s say it’s time for quarterly reviews, who doesn’t love evaluations? ¯_(ツ)_/¯ Implement your knowledge of GROUP BY against the following problem:

We need to figure out how much revenue our employees have realized from DVD rentals, and how many transactions each handled.

 

GROUP BY SQL Skills Answer #1

Let’s talk through the problem and dissect it before presenting code. “A problem well-stated is a problem-half solved” – paraphrase of some smart person.

  1. We’re talking about revenue, so we’ll need to be dealing with the payment table.
  2. We’re evaluating employees (staff), SUM of revenue, and COUNT of transactions.
  3. If we’re aggregating this data, we’re GROUPing BY employee.
  4. We’re also ORDERing the employees BY who handled the most transactions.

That said, here’s the code:

SELECT staff_id, SUM(amount), COUNT(amount)
FROM payment
GROUP BY staff_id
ORDER BY COUNT(amount) DESC;

…with our results!

 

GROUP BY SQL Skills Challenge #2

Well done on your challenge! Here’s the second:

It’s time to do inventory, flashbacks of retail and restaurant wonder for all of us. 🙂

In the name of forecasting and planning, we need to figure out the average replacement cost of our movies, by rating.

 

GROUP BY SQL Skills Answer #2

Ok, let’s walk through our problem.

  1. We need to use the film database here, since we’re gathering information on the movies.
  2. We’re GROUPing our films BY rating
  3. We’re using an aggregate function to determine the average replacement cost of each rating.

Drum roll, here’s a winning code snippet:

SELECT rating, AVG(replacement_cost)
FROM film
GROUP BY rating
ORDER BY AVG(replacement_cost) DESC;

With the output:

 

Are there more challenges we should be covering? Yes. However, I’m trying to do better about getting more sleep these days. Unlike past SQL articles, it’s still (barely) before midnight. So we’ll take a quick breather, possibly update this article, but definitely keep moving forward. Cheers!

 

Update- 8/20/2017 —

GROUP BY SQL Skills Challenge #3!

Alright SQL-ites. After getting some rest, I’ve regrouped a few days later to cover the last challenge:

From our database, we want to get the customer ID’s of the top 5 customers, by money spent, so we can send them a thank you for their business!

 

GROUP BY SQL Skills Answer #3

Let’s diagnose the problem.

  1. If we’re gathering revenue information, we’ll need to use the payment table.
  2. If we’re getting the top spending customers, we’ll need to GROUP all transactions BY customer ID
  3. To see the top 5 paying customers, we’ll want to ORDER the results BY the SUM of payment amount.

Considering the above, here’s our code:

SELECT customer_id, SUM(amount)
FROM payment
GROUP BY customer_id
ORDER BY SUM(amount) DESC
LIMIT 5;

DSC_0045 Zach Doty Cover Photo Group By SQL Statement Function

GROUP BY SQL Statement

Introducing the GROUP BY SQL Statement in PostgreSQL

‘Ello SQL geeks! Welcome back to our SQL learning journey. We left off with a beginner SQL skills challenge and the aggregate SQL functions: MIN, MAX, AVG and SUM. Today we’re looking at the GROUP BY statement. We’ll learn about this function in PostgreSQL and walk through usage of this handy SQL statement.

 

About the GROUP BY SQL Statement/Clause

From my simple understanding, GROUP BY functions like a hybrid of the following:

  • SELECT DISTINCT keyword (If used without an aggregate function like SUM), and,
  • An Excel pivot table, rolling up aggregate figures (Count, Sum, Average, etc.) into unique rows

If you’re familiar with Excel Pivot tables, then you’ll recognize here the power of this function.

 

Let’s take a look at some examples to clarify.

 

First Look at Using the GROUP BY Function

To better illustrate the power of GROUP BY, we’ll first show its usage without aggregate functions. Consider the following:

If we query the address table of our sample database with a generic SELECT * FROM address; we get back an atrocious 605 rows of data. Aggregate and useless!

2017-04-27-001-GROUB-BY-Generic-SELECT-Query-Pitfall

In contrast, if we call the GROUP BY function, we’ll get back a cleaner output, with fewer rows – only unique values returned. While this is an incremental improvement to analyzing data, there’s much left to be desired.

2017-04-27-002-GROUP-BY-Without-Aggregate-Function

 

What’s missing here? How about that pivot table-esque functionality? This is where the power of using GROUP BY with aggregate functions gets awesome.

Using the GROUP BY SQL Statement with Aggregate Functions

As with most analysis, a single data point or data series rarely holds significant insight value on its own. Let’s drive home that point by leveraging the GROUP BY statement with the SUM aggregate function. Below, we compare the ratings of films in our sample database in aggregate by replacement cost. Perhaps this could serve in-store strategies for loss prevention.

SELECT rating, SUM(replacement_cost)
FROM film
GROUP BY rating;

2017-04-27-003-GROUP-BY-SUM-Aggregate-Function

If we extend this functionality to more real world examples, we could use the following for GROUP BY:

  • Grouping page-level / URL data to roll up clickstream analytics data
  • Large scale analysis of CRM data for customer segmentation analysis
  • Analyzing returns for financial data

The list could (and I’m sure it does) go on.

Let’s take this one step further and reduce potential future workload, by building sort functionality into our query. Below, we add a line to get most expensive ratings to least.

SELECT rating, SUM(replacement_cost)
FROM film
GROUP BY rating
ORDER BY SUM(replacement_cost) DESC;

2017-04-27-004-GROUP-BY-SUM-ORDER-BY-SUM

By the looks of this, no need to guard Land Before Time 8. 🙂

Extending our lesson: you can use the COUNT, AVG, and other aggregate functions to analyze as desired.

 

Wrap Up

Alright, this was a relatively gentle introduction into more advanced SQL functions. GROUP BY is a rather critical function, so in our next article, we’ll be doing yet another skills challenge. Joy!

Feel free to catch up on our other articles that help you learn PostgreSQL. Also, check out some how-to’s on developing Amazon Alexa skills, and a new series on getting started with Machine Learning. As always, please share with your colleagues and share thoughts in the comments below. Cheers.

DSC_0007 Zach Doty Cover Photo for Beginner PostgreSQL Skills Challenge

Beginner SQL Skills Challenge!

Howdy, SQL geeks. Hope this post finds you swell!

Over the past few months, we’ve gained a ton of ground in learning SQL, or at least I have. 🙂

Let’s take a moment to:

  1. Test our knowledge of SQL skills learned thus far
  2. Start seeing SQL queries less as statements of code, and more as real world business challenges

In this article, we’ll have a recap plus three sections:

  • Recap of the training database we’ve been working with
  • General statements of each business problem
  • Hints and thoughts about how to approach each problem
  • Solutions to each problem

 

Recap: Our Training Database

Our training database is a best/old faithful. We’ve been using the surprisingly popular DVD rental database in a .tar file for our practice database.

Contained within this databases are various tables with fictitious information, including: customer information, film production information, business/pricing information and so forth.

In our challenges, we’ll execute various SQL queries to extract pieces of insight for business tasks. It’s assumed in this article that you’ve installed PostgreSQL via pgAdmin and have followed this article series so far, using the DVD rental training database.

Without further ado, let’s begin.

 

The SQL Challenges

Alright, here we go:

  1. How rentals were returned after July 17, 2005?
  2. How many actors have a last name that starts with the letter A?
  3. How many unique districts are represented in the customer database?
  4. Can you return the actual list of districts from challenge #3?
  5.  How many films have a rating of R and a replacement cost between $5 and $15?
  6. How many films have the word Truman somewhere in the title?

 

How to Approach the Challenges

Right, then. In this section, we’ll add some color commentary (read: hints) to our challenges. This should help you understand the mechanics of the solution, while ensuring you can’t see the answers all in one screen. 😉

 

Challenge 1: Rentals Returned After July 17, 2005

As with all challenges, a problem well stated is half (or more) solved. So let’s look at the high levels of the ask, and work our way down. We need information on rentals, so this means we’ll probably be querying the rentals table.

We would want to first examine the rentals table in a concise manner by doing:

SELECT * FROM rental LIMIT 5;

Once you’ve surveyed the table, we really only need one column returned (pun not intended) and we only want the sum figure of returns, where (HINT!!) the return date was after (think a logical operator here) July 17, 2005.

 

Challenge 2: Actors That Have a Last Name Beginning with “A”

Like our first challenge, let’s work from the “top down”. We need actor information, so querying the actor table would be a great place to start. Similar to last time, we need a count of values matching a condition. The difference versus challenge #1  is we need to find match a pattern like or such as an actor’s last name that begins with the letter A.

 

Challenge 3: Number of Unique Districts in the Customer Database

The title and description could cause some confusion here. You may need to do some basic SELECT * FROM table_name LIMIT X; queries to make sure you’ve got the right table. Once you do, we’re looking for an amount of distinct values in the database. Order of operations matters.

 

Challenge 4: Returning the Actual Lists of Districts from Challenge #3

Not much to hint at here – getting challenge #3 is the key here. You’ll really only be simplifying the correct query in challenge #3 to get the correct answer here.

 

Challenge 5: Cheap, (Mildly) Naughty Films

This one might be the longest query yet in this challenge. So we’re looking up film information, thus should know which table to query. We’re returning a value where a certain rating must be returned, and (HINT) we need to layer in one more lens of qualification. That lens dictates we specify a range between (cough, hint!) two values.

 

Challenge 6: Where in the World are Films Containing “Truman”?

This challenge is more of a recency test than retention of older concepts. You’ll need to employ pattern matching again for this business challenge/query to find titles that have some match like Truman in the title.

 

 

Challenge Solutions

Is that your final answer? Below are the queries, with screenshots of what I did.

 

Solution 1: Rentals Returned After July 17, 2005

SELECT COUNT(return_date) FROM rental
WHERE return_date > ‘2005-07-17’;

2017-04-15-001-Challenge-1-Sol-Count-From-Where

 

Solution 2: Actors That Have a Last Name Beginning with “A”

SELECT COUNT(*) FROM actor
WHERE last_name LIKE ‘A%’;

2017-04-15-002-Challenge-2-Sol-Count-From-Where-Like

 

Solution 3: Number of Unique Districts in the Customer Database

SELECT COUNT(DISTINCT(district)) FROM address;

2017-04-15-003-Challenge-3-Sol-Count-Distinct

 

Solution 4: Returning the Actual Lists of Districts from Challenge #3

SELECT DISTINCT(district) FROM address;

2017-04-15-004-Challenge-4-Sol-Select-Distinct

 

Solution 5: Cheap, (Mildly) Naughty Films

SELECT COUNT(*) FROM film
WHERE rating = ‘R’
AND replacement_cost BETWEEN 5 AND 15;

2017-04-15-005-Challenge-4-Sol-Count-From-Where-Between-And

 

Solution 6: Where in the World are Films Containing “Truman”?

SELECT COUNT(*) FROM film
WHERE title LIKE ‘%Truman%’;

2017-04-15-006-Challenge-4-Sol-Count-Where-Like-Wildcard

 

Wrap Up

Well done for completing these challenges! You shall indeed pass. 🙂

Soon, we’ll be covering aggregate SQL functions, such as MIN, MAX, AVG and SUM.

If you’re just joining us, here’s a running list of our articles so to date (4/16/2017):

 

DSC_0064 Zach Doty Unsupervised Machine Learning Intro Cover Photo

Unsupervised Learning Introduction: Machine Learning Essentials

Howdy, machine learning students! Today we’re going to introduce the concept of unsupervised machine learning algorithms.

Quick Recap: Supervised Learning

Before we jump in, let’s quickly recap our last article introducing supervised machine learning algorithms. This will give us the appropriate context for unsupervised learning.

In supervised machine learning problems, we supply pre-labeled data to the algorithm. By supplying data that’s already correctly labeled, we ask the algorithm to further predict (regression) or label (classification) new data.

2017-04-11-004-Multiple-Input-Classication-Machine Learning

 

Unsupervised Machine Learning = Unlabeled Data

The most immediate and prominent difference  for unsupervised learning is the data. Above, we gave the algorithm “a boost” by supplying the intended “right” answers in the data. Below, in an unsupervised machine learning problem, there are no right answers…yet.

2017-04-15-001M-Unsupervised-Machine-Learning-Problem-Data

We’ve supplied the algorithm with data in the problem, but it’s provided without labels or “answers”. We are mandating that the algorithm discover structure and infer patterns/labels on its own. We could also compare the above example to a clustering problem.

So in unsupervised learning, we supply a large amount of unlabeled data, without explicitly identified form or structure. We ask the algorithm to come up with ideas of structure and segmentation on its own.

Some additional applications of unsupervised learning could include:

  • Market segmentation of massive transaction data
  • Large scale social networking data
  • Astronomical data analysis
  • Large scale market data
  • Mass audio/voice analysis
  • Large scale gene clustering

 

Wrap Up

That was a bit of a quick one! The challenge with some these technical subject matter areas is sometimes we have limited room to run before going off into the technical weeds. This is one of those areas. Next, we’ll be covering some key concepts in the areas of machine learning model representation, cost function and parameter learning. Don’t worry too much about those yet, we’ll take it step by step. 🙂
As always, feel free to follow my other journeys of learning PostgreSQL, learning how to develop Amazon Alexa Skills, learning how to get started in algorithmic trading, JavaScript for beginners…and more to come soon! Cheers.

DSC_0024 Zach Doty Cover Photo for What is Machine Learning

What is Machine Learning?

Hello there, fellow Machine Learning (ML) students! Welcome back to our crash course in starting machine learning from an absolute beginner’s perspective.

In our previous article, we covered an introduction to Machine Learning, answering several key questions:

  • Where is machine learning used in our lives?
  • Where did machine learning come from?
  • Where is machine learning headed?

Forging ahead in our learning journey, we’ll introduce some definitions of machine learning and look at the major types of machine learning applications.

 

Machine Learning (ML), A Casual Definition by Arthur Samuel

Our first definition, teased in the last article, follows:

Machine learning is the practice of giving computers the ability to learn without being explicitly programmed to do so.

 

More on Arthur Samuel & Why His Definition on ML Matters

If you’re like me, you might not have heard of Arthur Samuel. Who is he, and why does his opinion matter in the fields of artificial intelligence and machine learning?

Arthur Samuel was a pioneer in artificial intelligence and computer gaming fields. In 1959, he coined the term “machine learning” as a founding father in the field. That’s why he’s important! Let’s also look at a more formal / scientific definition.

 

A More Formal Machine Learning Definition

Tom Mitchell, of Carnegie Mellon, offers a definition with more structure.

  • A well defined learning problem follows
  • E * T = P
    • Note: His definition does not include mathematical operators. I’m taking a large liberty to insert them myself. ¯\_(ツ)_/¯
  • Experience (E) placed against Task (T) is measured by Performance (P)

2017-04-06-001-Machine-Learning-Definition-ETP-Framework

Here’s a further example:

Example: playing Go.

E = the experience of playing many games of Go.

T = the task of playing and winning Go.

P = the probability that the program will win the next game.

 

Major Categories of Machine Learning Algorithms

If you judge by press coverage of ML as I have, it appears to be a nebulous field. (In all fairness, it may still be.) However, there is structure we can take in learning ML. There are two types of machine learning algorithms:

  • Supervised learning algorithms
  • Unsupervised learning algorithms

There are a couple of other prominent types of machine learning algorithms as well: reinforcement learning and recommender systems.

 

 Wrap-Up

Congratulations, we’ve cleared a very gentle introduction to machine learning, and it’s novice/high level definitions. I look forward to learning more with you, dear reader! Our next articles will cover a bit more detail about the two major ML algorithm types: supervised learning, and unsupervised learning. Until then, look after each other.

DSC_0030 copy Zach Doty Cover Photo For SQL Statement SELECT WHERE

SQL Query Fundamentals: SELECT WHERE

Welcome back, travelers! The journey continues in learning SQL. In case you missed it, the past couple of posts were learning to use SELECT DISTINCT and Restoring an SQL database with table schema only.

Today we’re going to learn about using the SELECT clause with the WHERE statement for essential SQL queries.

A Quick Recap, or, Why SELECT WHERE is Important

As previously mentioned, we’ve covered a range of beginning SQL topics. Mainly, we’ve learned about using the SELECT statement to query all (*) or query specific columns of data from a table. Because we’ve been working with a pared down table with only a few hundred rows, it’s not a problem in this “academic” setting to return all the rows. But what if we’re working in a larger database? A recent keyword research dataset 75,000 rows long comes to mind. (Though I would imagine that too, would be a small dataset in the grand scheme of things, but I digress.)

When we start working with larger databases, granularity will become vitally important. This is where (pun not intended) the SELECT WHERE statement comes in.

Sample Syntax of the SELECT WHERE Statement

With proof of concept in mind, let’s jump into it headlong. Below is a syntax example, demonstrating what a SELECT WHERE SQL query might look like.

SELECT column_1,column_2

                FROM table_name

                WHERE condition1;

The SELECT statement is old news to you adventurers that have been following along. The WHERE portion of this query will be the power in this article.

More About the SELECT WHERE SQL Statement

The WHERE clause appears right after the FROM clause of the SELECT statement. The conditions listed within the WHERE clause are used to filter the rows returned from the SELECT statement. Because we’re working in pgAdmin / PostgreSQL, we’ll have available standard operators to construct the conditions. Better still, some (or most) of the operators we’ll look at are fairly universal, so these operators should work in MySQL, Microsoft SQL, Amazon Redshift, etc.

List of SELECT WHERE Operators

Below is a list of common SELECT WHERE operators. Again, most of these should be fairly universal, regardless of the SQL database management program you’re using.

OperatorDescription
=Equal to
Greater than
Less than
>=Greater than or equal to
<=Less than or equal to
<> or !=Not equal to
ANDLogical operator AND
ORLogical operator OR

 

Plugging this back into SQL statements, we’ll be using the operators on the left to filter down and return only specific rows in our queries.

Sample SELECT WHERE Statements

Let’s  cover some guiding examples that will help us apply the SELECT WHERE operators. To start, we’ll kill two birds with one stone: jogging the memory by utilizing a previous query and exploring the table we’ll be querying. Below, we’ll be a bit naughty by calling on all columns from the table.

SELECT * FROM customer;

2017-01-30-004-SELECT-WHERE-CustomerID-OR

Above, we’ll see our results and some candidate columns to query the heck out of. Let’s keep moving.

 

Example 1: The Basic SELECT WHERE Statement with 1 Condition, Returning Only Customers with a Certain Name

Okay, let’s say that we want to only return customers of a certain name, say, “James”. The SELECT WHERE statement will help us make quick work of this database need.

SELECT last_name,first_name

                FROM customer

                WHERE first_name = ‘James’;

If all goes correctly, we should only get back a customer with the first name James. ‘Ello James!

2017-01-30-002-SELECT-WHERE-First-Name-Equals

Side note 1: you don’t have to return the columns you’re filtering against. For example, we could return the email column and still filter by name.

 

Example 2: A SELECT WHERE Statement Using 2 Conditions, Returning Customers with a Certain First AND Last Name

Perhaps you’ll want to do something more targeted with your data. I know this is a narrow and frankly creepy example of calling out one name, but think maybe of a City/State or Source/Medium pairing? Anyway, with the sample dataset we have, below we use the AND logical operator to combine two conditions into one query.

SELECT last_name,first_name

                FROM customer

                WHERE first_name = ‘Jared’ AND last_name = ‘Ely’;

If executed properly against this particular sample dataset, we should be returned only the values for one fictional Mr. Jared Ely.

2017-01-30-003-SELECT-WHERE-First-Name-and-Last-Name

Quick side note 2: that we should have mentioned sooner: we’re using single quotes here because the values we’re querying against are string values. As such, the single quotes help us match format, et cetera.

 

Example 3: Another SELECT WHERE Statement Using 2 Conditions, Returning Customer ID’s where payment was in certain dollar amount ranges

Let’s say we are trying to identify a range of customers in our database. In this third example, we want to query Customer ID’s and names in a certain range. We could think of it as a feeble attempt to get our first customers or our most recent customers for a special email flight. Below, we’ll exercise the OR operator to accomplish our desired output.

SELECT customer_id,first_name,last_name

                FROM customer

                WHERE customer_id <= 2 OR customer_id >=20 ;

2017-01-30-004-SELECT-WHERE-CustomerID-OR

Understanding the Subtle Differences Between AND / OR

Quick note about differences between the AND / OR operators. If you’re trying to filter data from two different columns, then AND is your filter. If you’re trying to get distinct values within a single column, then the OR operator will be best suited for the job.

 

Wrap-Up

What a world of possibilities we’ve opened! I found myself needing to slow down and pay more attention to detail in this area of learning. The nuances of selecting certain columns but filtering by others when practicing threw me for a loop once or twice.

Thanks for joining, in the next article, I’ll be covering some introductory material around the COUNT function. In the meantime, check  out our running list of posts on how to learn SQL.

Exploring marketing career paths by climbing a mountain in Colorado

4 Simple Marketing Career Paths Your Professor Never Told You About

Several months ago, I sat in a warm conference room in the University of Houston Bauer business school. The wind gusted outside;  professors and a couple of brave students shuffled papers on a conference table between presentations. I was privileged to be an advisory board member and mentor at this Enactus UH event.

Unexpectedly (to me), an adjacent student leaned over and asked about marketing career paths. What an intelligent question! I realized determining a marketing career path can be utterly confusing as a student. There are slick advertising agencies, PR boutiques, massive corporations with large silos and vague job titles all around. What on earth is an account executive and what makes it different from an account planner, account manager or anything else?

In that moment, it was my pleasure to share with this student how to evaluate marketing career paths. Some years ago, I was strangely fortunate that by providence I’ve enjoyed such an easy, clear path so far in my young career. Reflecting on that exchange, there are likely thousands of students seeking clarity and guidance on marketing careers.

Below is a simple method that shows you marketing careers at a glance, with detailed explanations below.

marketing-advertising-jobs-career-starts

tldr;

In marketing, you make major two choices that determine your career path: where you work and what you do. You’ll either work at an agency or a client and you’ll either focus on branding or direct response. Obviously, there’s room for a lot of creativity and nuance – but a large portion of marketing careers fit this framework.

Where You Work: Agency

We’ll start here, because I’m most familiar with this area. An agency is a 3rd party that a brands such as Adidas or GM hires to help with marketing. Agencies exist because they provide expertise and impartiality that brands can’t get in-house.

Marketing and advertising agencies are a great starting point to master tactical execution. You’ll often be on the sharp end of the spear: pulling reports, writing copy and auditing campaigns. It’s often a fast-paced environment. The offices have bright lights, open concept floor plans, and quirky perks.

Working in an agency is fun! However, it’s imperative you do some homework before joining an agency. I’ve been very fortunate to work at agencies that offer great benefits and pay. My agency experiences have been nothing short of phenomenal. Regrettably, not all marketing agencies are that way. Some agencies  can be volatile if clients leave or have long/tough hours in uncertain conditions. That said, it’s your privilege of choice!

Where You Work: Client

If you’re not the agency, you’re likely the client. If you’re on the “client side”, you’re likely working in-house at a brand like Hilton or Neiman Marcus.

Client side marketing can vary a lot, depending on the marketing maturity of your organization. In some organizations, you may be managing agencies, vendors and strategy. In less-developed or smaller companies, marketing may be closer to sales support.

In general, client-side jobs are more stable and evenly paced than agencies. Deadlines are usually more relaxed and the pay is generally higher compared to agencies. However, marketing can be vulnerable in a corporate setting. If your organization falls on hard financial times, accounting and marketing are the first functions to be cut.

Much like agencies, there are great brands and tough situations. Do your homework!

What You Do: Branding

Now you know where you can work in marketing and advertising. The aspiring marketer’s next major decision is what you do.

In marketing, you either focus on consumer actions or consumer perceptions. The art and science of branding focuses on consumer perceptions. Branding is a fascinating discipline. You constantly fight battles in the minds of brands and consumers. You must appear first, as the best.

Branding jobs often have “creative” or “strategist” somewhere in the title; a keen sense of strategy is key to your success. To learn more about branding, you should read the classic, “Positioning” by Al Ries and Jack Trout. A caveat to branding-centric positions: you may lack experience in tactical marketing execution at the expense of creative pursuits. If you’re a strategy maven (which is rare), you can survive without concrete tactical knowledge. If you’re weak in strategy, it can be tough to advance your career in this field.

What You Do: Direct Response

The other major marketing avenue is direct response. Instead of focusing on consumer perceptions (branding), you work with consumer actions. Direct response is self-evident, you work to produce an action from a consumer. If you envision a purchase funnel, direct response typically handles the end: making a sale, strengthening a relationship.

Direct response jobs often have “analyst”, “planner” or “specialist” in the title. Working in direct response is fantastic to build practical, tactical experience that can be used in many places & many ways. You’ll find skills such as auditing, research and optimization highly marketable.

The upside of starting in direct response is that marketing is inherently self-promotional: the barriers to entry are low. However, if you don’t chase strategy skills, you can be quickly stuck as a tactician and encounter long-term difficulty advancing your career.

Finding Your Path

The good news: marketing career paths are an open playing field! The bad news: marketing career paths are an open playing field. Hopefully this post brings some definition to a broad field, while pointing out benefits and risks.

DSC_0034 Zach Doty Cover Photo How to Learn SEO in Five Minutes

How to Explain SEO in 5 Minutes

Today, you’re going to learn how to explain SEO to anyone in five minutes! Is it a comprehensive, end-all, be-all example? No. Will it help someone quickly understand the basics of a complex practice? Yes. Let’s jump in. This is, after all, a five minute guide!

Starting Point: SEO is not a black box.

For context, I often hear that SEO is a “black box” and it’s anyone’s guess as to how you rank in search engines. I’m always happy to hear that! It means we have a chance to explain what SEO is really about. Google isn’t a black box. Rather, its algorithms and search engines are so complex at a large scale, it’s difficult to comprehend on a daily basis.
With introductions out of the way, here’s how you do it:
A Job Application
Consider SEO like a job application. You have a resume, references, you deal with a recruiter and a hiring manager. And you want a great career!
In SEO, you have a website, backlinks, the search engine and the user. You want to be found and loved!

Your Resume
Your resume is your website. You control the content and tailor the experience to each opportunity. You balance who you are and what you do.
Your website balances your brand story and products or services. You target keywords to pages of your site, just like your jobs are targeted to certain functions and skill sets.
Which resume would you choose to interview for a Marketing Director?

A. “I’m Leroy, I’m awesome. I’m really cool, I enjoy long walks on the beach and like Ethiopian food.”

B. “Kevin is a seasoned marketing professional who has managed and implemented effective marketing promgrams.”

By the same token, which website would you do business with? One with “high quality products and synergistic solutions”? Or a site that speaks in plain, compelling English?
Your References
Your references are your backlinks, or 3rd party websites linking to your own. You can influence these endorsements and relationships, but not necessarily control them. On both sides of this comparison, quality trumps quantity.

Which of the following references holds the most weight for a candidate?

A. “Little Michael is my favorite nephew, he eats a lot at Thanksgiving, you should hire him.”

B. “Jake paid me some money to recommend him. I guess he’s cool.”

C. “Ryan took our marketing department to the next level with smart strategy. He works well with the team implementing our campaigns.”

Ryan is the best choice. A bunch of low quality or fake recommendations will do a job seeker little good if not harm. By the same token, it’s better to have a few excellent backlinks than a bunch of spammy or fake endorsements.
The ultimate goal
When you’re a job seeker, interviews are great. But you really want the career! Likewise, it’s great to rank well in the search engines, but you want new, loyal customers that convert on your website.

Above all else, optimize your website for your users. The search engine results will follow.

Where to next?

That’s how to explain and understand SEO in five minutes! Obviously, there’s much, much more to learn. However, this will get you on the right path. Dive deeper into SEO with a fantastic Beginner’s Guide from Moz.