Posts

DSC_0511 Zach Doty Cover Photo Univariate Linear Regression

Univariate Linear Regression Concepts

Howdy, machine learning compatriots! Welcome back to our foray into getting started with machine learning. Previously, we covered some core machine learning concepts, namely supervised machine learning algorithms and unsupervised / deep learning. (For the full series to date, here’s our Machine Learning for Beginners page.)

 

Today we’re learning the concepts behind supervised machine learning algorithms. Specifically, we examine univariate (one variable) linear regression. Univariate linear regression is the beginner’s playpen in supervised machine learning problems. We endeavor to understand the “footwork” behind the flashy name, without going too far into the linear algebra weeds.

 

Quick Recap: Supervised Machine Learning Problems

If you’re just dropping into the series, we’ll quickly set today’s stage. Univariate linear regression falls under the category of regression algorithms, within supervised machine learning problems.

 

2017-04-30-001-Machine-Learning-Algorithm-Types

  • Supervised learning: we provide the algorithm with pre-cleaned, pre-labeled data. The algorithm learns off the data we provide to classify or predict new data.
  • Regression: making a line of best fit.

 

When we first covered supervised machine learning concepts, regression was shown to make a line of best fit from existing data, so we could predict new data points. Below, we first used the example of an SEO team predicting how many unique linking domains a page would need to achieve a certain rank. (A supervised learning problem, using a regression algorithm for future predictions.)

 

2017-04-11-002-Regression-Problem-Linear-Quadratic-Comp

Important note: our graphic above is similar to, but not quite, linear regression. Details, details. At any rate, this should lead us nicely into examining the inner workings of univariate linear regression.

 

A High Level Look at the Regression Problem Process

If I’m being brutally honest, the process of translating machine learning education to public-facing blog posts has been my toughest effort to date. In other words, I try to make my posts easy to follow, like dummy notes I’m taking as I learn.

That being said, two weeks into a machine learning course, the content has already gone off the rails, deep into linear algebra and so forth. So, instead of going into the weeds for publication, I’m trying to keep it snackable (buzzword bingo, drink!) and down to earth.

Let’s settle in slowly on the regression problem process. As illustrated below, we need a few key pieces:

  1. A cleaned training data set with correct labels
  2. A program (such as Matlab or Octave) with access to an appropriate univariate linear regression algorithm
  3. A hypothesis and prediction of new values

2017-04-30-002-Univariate-Machine-Learning-Regression-Process

 

Let’s peel back a layer and go slightly deeper. Since cleaning and correctly labeling training data is largely dependent on you & your domain, we’re skipping that step. ¯\_(ツ)_/¯

 

Instead, let’s look closer at the algorithm & hypothesis portions! Our first stop is something called the cost function.

 

The Cost Function in Linear Regression Learning Problems: Squared Error

Before we jump into the cost function, let’s turn over a new leaf in visual examples. Instead of our SEO example, let’s look at a problem that could be more linear-friendly. Below, let’s assume we have some data on customers’ lifetime value plotted against the number of marketing touchpoints they’ve interacted with.

2017-04-30-003a-Univariate-Linear-Regression-Sample-Data-Viz

 

Okay, with the housekeeping complete, let’s remember our goal for linear regression: find the line of best fit. 

Let’s also tie this back to the real world. Perhaps we’re a marketing director or VP of marketing needing to convey the ideal number of marketing touchpoints to the CMO and CEO. Doing so could help guide budgeting, channel mix, and planning questions.

How do we find a line of best fit? Through linear algebra and programming, we can objectively determine the best fit by testing hypotheses and measuring each hypothesis line against the actual data points for closeness of fit.

2017-04-30-004a-Univariate-Linear-Regression-Cost-Function-Hypothesis-A

Being frank, the material up to this point is pretty humdrum. However, when we start making hypotheses such as the above, things get interesting. The program “makes a guess” as to the line of best fit, perhaps like the illustration above. I’m no “eggspert”, but that doesn’t look like a great line of fit.

 

But have no fear, dear reader: math/science comes to the rescue. The next portion of the algorithm calculates the distance from each training data point to the hypothesized line of best fit using a squared error cost function. When you plot each hypothesis against its squared error sum, you may get a distribution something like the below.

2017-04-30-005-Univariate-Linear-Regression-Cost-Function-Plot

Bear with me. Let’s say we plotted:

  1. Our illustrated hypothesis (teal plus sign)
  2. Other attempted hypotheses (tan x’s), and,
  3. The best fit hypothesis (green outlined star)

This renders a convex, parabola-like curve. To get the line of best fit, we want to reach the lowest point on that curve, known as the global minimum. The further magic in machine learning is how we move from a lame hypothesis (teal plus sign) to the solution (green outlined star). Now, meet a technique called gradient descent. Sidebar: if we’re being more mathematical and technical about it, this really plots as a 3D bowl-shaped surface (one axis per parameter, plus the cost), but the above explanation should suffice for now if we’re not getting bogged down in the math.
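For the code-curious, here’s a minimal sketch of that squared error cost function in Python (not the Matlab/Octave we’ll install later, and the numbers are completely made up), just to show the footwork:

import numpy as np

# Hypothetical sample data: marketing touchpoints (x) vs. customer lifetime value (y).
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([120, 150, 210, 240, 310, 330, 400, 430], dtype=float)

def hypothesis(x, theta0, theta1):
    # A univariate linear hypothesis: h(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

def squared_error_cost(x, y, theta0, theta1):
    # J(theta0, theta1) = 1 / (2m) * sum((h(x) - y)^2)
    m = len(x)
    errors = hypothesis(x, theta0, theta1) - y
    return np.sum(errors ** 2) / (2 * m)

# A lame guess versus a better guess: the lower the cost, the closer the fit.
print(squared_error_cost(x, y, theta0=0.0, theta1=10.0))   # far from the data
print(squared_error_cost(x, y, theta0=70.0, theta1=45.0))  # much closer

Each hypothesis (a pair of theta values) gets a single cost number, and those cost numbers are exactly what the plot above is showing.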

 

Parameter Learning & Gradient Descent

Gradient descent is the iterative mathematical process of working our way down the squared error plot from a lousy hypothesis to a line of best fit. Again, we’re not delving into calculations and derivatives – there’s a TON of math that goes behind this material.

Gradient descent systematically tests increments of hypotheses against a specified learning rate. The learning rate is essentially the magnitude of each step you take as you work your way down the convex cost function toward the minimum.

2017-04-30-006-Univariate-Linear-Regression-Simplified-Gradient-Descent
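To make that concrete, here’s a bare-bones gradient descent sketch in Python, using the same hypothetical data as the cost function example above. The learning rate and iteration count are arbitrary picks for illustration, not gospel:

import numpy as np

# Same hypothetical data as the cost function sketch above.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([120, 150, 210, 240, 310, 330, 400, 430], dtype=float)

def gradient_descent(x, y, alpha=0.01, iterations=5000):
    # Repeatedly nudge theta0 and theta1 "downhill" on the cost surface,
    # taking steps scaled by the learning rate alpha.
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y
        grad0 = np.sum(errors) / m        # partial derivative w.r.t. theta0
        grad1 = np.sum(errors * x) / m    # partial derivative w.r.t. theta1
        theta0 -= alpha * grad0           # simultaneous update of both parameters
        theta1 -= alpha * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent(x, y)
print(f"Line of best fit: y = {theta0:.1f} + {theta1:.1f} * x")

If the learning rate is too large, the steps overshoot and the cost can actually blow up; too small, and you’ll wait a very long time to reach the global minimum.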

 

Wrap Up

Did I mention this is one of the toughest posts for me to date? The other contender is my DIY Alexa Raspberry Pi article. It’s now 3:20 a.m. on a Saturday night/Sunday morning as I type this conclusion. (Insert horror emoji.)

So, if we were to break down all of the above into a short bulleted list:

  1. Univariate linear regression takes sample data to make a line of best fit
  2. “Best fit” is objectively measured by a squared error cost function, i.e., the summed squared distances of the hypothesis line from the actual data points
  3. Plotting hypotheses against their squared error yields a roughly convex, parabola-like graph
  4. Gradient descent is an algorithm that systematically reduces the squared error of the hypothesis, guided in part by the learning rate
  5. Gradient descent iteratively seeks the global minimum of the convex cost function, which corresponds to the line of best fit
  6. The line of best fit is determined, (and Teh Lurd of Teh Rings finishes on your second monitor to Herb Alpert’s Spanish Flea.)

 

Next up, we’ll be installing some machine learning software (Matlab & Octave) and diving into multivariate regression. Look after each other.

DSC_0045 Zach Doty Cover Photo Group By SQL Statement Function

GROUP BY SQL Statement

Introducing the GROUP BY SQL Statement in PostgreSQL

‘Ello SQL geeks! Welcome back to our SQL learning journey. We left off with a beginner SQL skills challenge and the aggregate SQL functions: MIN, MAX, AVG and SUM. Today we’re looking at the GROUP BY statement. We’ll learn how this clause works in PostgreSQL and walk through usage of this handy SQL statement.

 

About the GROUP BY SQL Statement/Clause

From my simple understanding, GROUP BY functions like a hybrid of the following:

  • SELECT DISTINCT keyword (If used without an aggregate function like SUM), and,
  • An Excel pivot table, rolling up aggregate figures (Count, Sum, Average, etc.) into unique rows

If you’re familiar with Excel Pivot tables, then you’ll recognize the power of this clause.

 

Let’s take a look at some examples to clarify.

 

First Look at Using the GROUP BY Function

To better illustrate the power of GROUP BY, we’ll first show its usage without aggregate functions. Consider the following:

If we query the address table of our sample database with a generic SELECT * FROM address; we get back an atrocious 605 rows of data. Unaggregated and useless!

2017-04-27-001-GROUB-BY-Generic-SELECT-Query-Pitfall

In contrast, if we use the GROUP BY clause, we’ll get back a cleaner output, with fewer rows – only unique values returned. While this is an incremental improvement for analyzing data, there’s much left to be desired.

2017-04-27-002-GROUP-BY-Without-Aggregate-Function

 

What’s missing here? How about that pivot table-esque functionality? This is where the power of using GROUP BY with aggregate functions gets awesome.

Using the GROUP BY SQL Statement with Aggregate Functions

As with most analysis, a single data point or data series rarely holds significant insight on its own. Let’s drive home that point by leveraging the GROUP BY statement with the SUM aggregate function. Below, we total the replacement cost of the films in our sample database by rating. Perhaps this could inform in-store strategies for loss prevention.

SELECT rating, SUM(replacement_cost)
FROM film
GROUP BY rating;

2017-04-27-003-GROUP-BY-SUM-Aggregate-Function

If we extend this functionality to more real world examples, we could use the following for GROUP BY:

  • Grouping page-level / URL data to roll up clickstream analytics data
  • Large scale analysis of CRM data for customer segmentation analysis
  • Analyzing returns for financial data

The list could (and I’m sure it does) go on.

Let’s take this one step further and reduce potential future workload by building sort functionality into our query. Below, we add a line to sort ratings from most expensive to least.

SELECT rating, SUM(replacement_cost)
FROM film
GROUP BY rating
ORDER BY SUM(replacement_cost) DESC;

2017-04-27-004-GROUP-BY-SUM-ORDER-BY-SUM

By the looks of this, no need to guard Land Before Time 8. 🙂

Extending our lesson: you can use the COUNT, AVG, and other aggregate functions to analyze as desired.

 

Wrap Up

Alright, this was a relatively gentle introduction into more advanced SQL functions. GROUP BY is a rather critical function, so in our next article, we’ll be doing yet another skills challenge. Joy!

Feel free to catch up on our other articles that help you learn PostgreSQL. Also, check out some how-to’s on developing Amazon Alexa skills, and a new series on getting started with Machine Learning. As always, please share with your colleagues and share thoughts in the comments below. Cheers.

DSC_0007 Zach Doty Cover Photo for Beginner PostgreSQL Skills Challenge

Beginner SQL Skills Challenge!

Howdy, SQL geeks. Hope this post finds you swell!

Over the past few months, we’ve gained a ton of ground in learning SQL, or at least I have. 🙂

Let’s take a moment to:

  1. Test our knowledge of SQL skills learned thus far
  2. Start seeing SQL queries less as statements of code, and more as real world business challenges

In this article, we’ll have a recap plus three sections:

  • Recap of the training database we’ve been working with
  • General statements of each business problem
  • Hints and thoughts about how to approach each problem
  • Solutions to each problem

 

Recap: Our Training Database

Our training database is an old faithful. We’ve been using the surprisingly popular DVD rental database, distributed as a .tar file, for our practice database.

Contained within this database are various tables with fictitious information, including: customer information, film production information, business/pricing information and so forth.

In our challenges, we’ll execute various SQL queries to extract pieces of insight for business tasks. It’s assumed in this article that you’ve installed PostgreSQL via pgAdmin and have followed this article series so far, using the DVD rental training database.

Without further ado, let’s begin.

 

The SQL Challenges

Alright, here we go:

  1. How many rentals were returned after July 17, 2005?
  2. How many actors have a last name that starts with the letter A?
  3. How many unique districts are represented in the customer database?
  4. Can you return the actual list of districts from challenge #3?
  5.  How many films have a rating of R and a replacement cost between $5 and $15?
  6. How many films have the word Truman somewhere in the title?

 

How to Approach the Challenges

Right, then. In this section, we’ll add some color commentary (read: hints) to our challenges. This should help you understand the mechanics of the solution, while ensuring you can’t see the answers all in one screen. 😉

 

Challenge 1: Rentals Returned After July 17, 2005

As with all challenges, a problem well stated is half (or more) solved. So let’s look at the ask from a high level and work our way down. We need information on rentals, so this means we’ll probably be querying the rental table.

We would want to first examine the rentals table in a concise manner by doing:

SELECT * FROM rental LIMIT 5;

Once you’ve surveyed the table, we really only need one column returned (pun not intended), and we only want a count of returns where (HINT!!) the return date was after (think a comparison operator here) July 17, 2005.

 

Challenge 2: Actors That Have a Last Name Beginning with “A”

Like our first challenge, let’s work from the “top down”. We need actor information, so querying the actor table would be a great place to start. Similar to last time, we need a count of values matching a condition. The difference versus challenge #1 is that we need to match a pattern, such as an actor’s last name that begins with the letter A.

 

Challenge 3: Number of Unique Districts in the Customer Database

The title and description could cause some confusion here. You may need to do some basic SELECT * FROM table_name LIMIT X; queries to make sure you’ve got the right table. Once you do, we’re looking for a count of distinct values in that table. Order of operations matters.

 

Challenge 4: Returning the Actual Lists of Districts from Challenge #3

Not much to hint at here – solving challenge #3 is the key. You’ll really only be simplifying that correct query to get the correct answer here.

 

Challenge 5: Cheap, (Mildly) Naughty Films

This one might be the longest query yet in this challenge. We’re looking up film information, so we should know which table to query. We’re returning a count where a certain rating must be matched, and (HINT) we need to layer in one more lens of qualification. That lens dictates we specify a range between (cough, hint!) two values.

 

Challenge 6: Where in the World are Films Containing “Truman”?

This challenge is more of a recency test than retention of older concepts. You’ll need to employ pattern matching again for this business challenge/query to find films with some match like Truman somewhere in the title.

 

 

Challenge Solutions

Is that your final answer? Below are the queries, with screenshots of what I did.

 

Solution 1: Rentals Returned After July 17, 2005

SELECT COUNT(return_date) FROM rental
WHERE return_date > '2005-07-17';

2017-04-15-001-Challenge-1-Sol-Count-From-Where

 

Solution 2: Actors That Have a Last Name Beginning with “A”

SELECT COUNT(*) FROM actor
WHERE last_name LIKE 'A%';

2017-04-15-002-Challenge-2-Sol-Count-From-Where-Like

 

Solution 3: Number of Unique Districts in the Customer Database

SELECT COUNT(DISTINCT(district)) FROM address;

2017-04-15-003-Challenge-3-Sol-Count-Distinct

 

Solution 4: Returning the Actual Lists of Districts from Challenge #3

SELECT DISTINCT(district) FROM address;

2017-04-15-004-Challenge-4-Sol-Select-Distinct

 

Solution 5: Cheap, (Mildly) Naughty Films

SELECT COUNT(*) FROM film
WHERE rating = 'R'
AND replacement_cost BETWEEN 5 AND 15;

2017-04-15-005-Challenge-4-Sol-Count-From-Where-Between-And

 

Solution 6: Where in the World are Films Containing “Truman”?

SELECT COUNT(*) FROM film
WHERE title LIKE '%Truman%';

2017-04-15-006-Challenge-4-Sol-Count-Where-Like-Wildcard

 

Wrap Up

Well done for completing these challenges! You shall indeed pass. 🙂

Soon, we’ll be covering aggregate SQL functions, such as MIN, MAX, AVG and SUM.

If you’re just joining us, here’s a running list of our articles to date (4/16/2017):

 

Supervised Learning & Its Types: Machine Learning Essentials

Welcome back, machine learning geeks! Let’s delve deeper into our journey of mastering machine learning. In the previous article, we looked at both informal and technical definitions of machine learning.

 

We also looked at the two major types of machine learning algorithms: A) supervised machine learning algorithms, and B) unsupervised machine learning algorithms. We touched on reinforcement learning and recommender systems as well, but won’t spend as much time there.

 

Let’s jump in!

 

Introduction to Supervised Learning

Supervised machine learning algorithms are used when you:

  1. Have a set of known, correctly labeled data
  2. Are looking to predict an output for new data, whether a continuous value (regression) or a category (classification)

 

Let’s visualize by looking at a digital marketing example.

 

Perhaps we are digital marketers looking to forecast or predict how much time and effort we’ll need to spend on outreach and content promotion for a particular webpage and target ranking.

 

Say we’ve gathered some data about website pages with:

  • Their rank for a given keyword
  • The amount of unique linking domains pointing to each page

 

Such a distribution of data might look like the below. It demonstrates a trend, but right now, we don’t have a single linear function that will “connect all the dots”.

 

2017-04-09-001-Supervised-Machine-Learning-Regression-Example

 

This is a great example for the first major subdivision of supervised machine learning algorithms:

 

Regression Learning Problem

Off the cuff, there are a couple of different ways in which we might try to solve this problem. Both solutions involve using the “labeled” data to predict a line of best fit, which, on the whole, minimizes the distance between the line and all the points. If we fit a simple straight line, predictions could be precarious at best and misrepresentative at worst.

 

We could also instruct our programs to fit a quadratic equation to the data (read: not a straight line.) In our slightly altered example here, the difference could be significant.

 

2017-04-11-002-Regression-Problem-Linear-Quadratic-Comp

 

At this point in time, we won’t focus on whether we should pick a linear or quadratic line for the regression output. However, it is worth noting that the two different methods could yield widely varying results.

 

Say we wanted to get a webpage ranking in position 5 for this given study: a linear fit would have us preparing to secure links from ~180 unique domains. If we decided on the quadratic solution, we could be looking at significantly less effort, perhaps ~125 unique linking domains.
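If you want to poke at this yourself, here’s a hedged little Python sketch with made-up rank / linking-domain numbers (so the outputs won’t match the chart above exactly), comparing a linear and a quadratic fit for the position 5 question:

import numpy as np

# Hypothetical data: ranking position (1 = best) and unique linking domains observed.
rank = np.array([1, 2, 3, 4, 6, 7, 8, 9, 10], dtype=float)
domains = np.array([320, 255, 215, 185, 130, 110, 95, 85, 80], dtype=float)

# Fit a straight line (degree 1) and a quadratic curve (degree 2) to the same data.
linear_fit = np.polyfit(rank, domains, deg=1)
quadratic_fit = np.polyfit(rank, domains, deg=2)

# Estimate the linking domains needed to hit position 5 under each model.
target_position = 5.0
print("Linear estimate:   ", np.polyval(linear_fit, target_position))
print("Quadratic estimate:", np.polyval(quadratic_fit, target_position))

Same data, two models, two different answers for the outreach budget.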

 

Classification Learning Problem

Insert smooth segue here and please forgive my lazy writing at this time. 🙂

 

The next major subdivision of supervised machine learning algorithms is known as a classification problem. Let’s use another example.

 

We are analyzing a large user study of an Amazon Alexa Skill in development. Perhaps we are classifying each interaction with the skill as a success or failure (1 or 0), plotted against the measured spoken word count for the given interaction.

 

Visualized, this data might look like the below.

 

2017-04-11-003-Supervised-Classification-Machine-Learning

 

In this example with (shockingly 🙂 ) clean data, we might want to guide development efforts in providing the best sample phrases/interactions for the skill. Perhaps we would want to measure the probability that an interaction four (4) spoken words long will be successful. This is known as a classification learning problem.
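As a hedged illustration (the data below is invented, and scikit-learn’s logistic regression is just one of many classifiers we could reach for), here’s how that probability question might look in Python:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical user-study data: spoken word count per interaction,
# and whether the skill handled it successfully (1) or not (0).
word_counts = np.array([[2], [3], [3], [4], [5], [6], [7], [8], [9], [10]])
success = np.array([1, 1, 1, 1, 1, 0, 1, 0, 0, 0])

# Fit a simple one-feature classifier.
model = LogisticRegression()
model.fit(word_counts, success)

# Estimate the probability that a four-word interaction succeeds.
prob = model.predict_proba([[4]])[0][1]
print(f"Estimated success probability for a 4-word interaction: {prob:.2f}")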

 

Above, we examined only one factor in determining a probability. However, we aren’t limited to examining just one parameter.

 

Let’s consider the following: perhaps we are an e-commerce retailer or digital business. A frustration for many marketers is the “one and done” (self-explanatory) customer that represents minimal customer lifetime value for the brand.

 

It would certainly behoove us to identify these customers and provide them with specialized messaging or a compelling promotion offer to keep them engaged and transacting with the brand.

 

Below, we could have a sample data set to which we fit a line, and thereby predict, based on a certain age and AOV (average order value) profile, whether a particular customer is likely to be a “one and done” consumer.

 

2017-04-11-004-Multiple-Input-Classication-Machine Learning
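Extending the single-input sketch from earlier to two inputs, something like the following could flag likely “one and done” customers. Again, the records and labels are invented, and in reality you’d want far more than eight data points:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical customer records: [age, average order value],
# and whether each customer ended up "one and done" (1) or a repeat buyer (0).
features = np.array([
    [22, 35], [25, 40], [31, 55], [29, 45],     # ended up one and done
    [38, 90], [45, 120], [52, 150], [48, 110],  # came back and kept buying
])
one_and_done = np.array([1, 1, 1, 1, 0, 0, 0, 0])

model = LogisticRegression()
model.fit(features, one_and_done)

# Probability that a new 27-year-old customer with a $50 order is "one and done".
print(model.predict_proba([[27, 50]])[0][1])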

 

In practice, we could potentially use a number of inputs to help solve machine learning problems. There are even methods to use an “unlimited” number of inputs: support vector machines. But that’s only a tease for now!

 

Wrap-Up

Our first major category of machine learning algorithms is supervised learning! In supervised learning, we assist the program by supplying correctly labeled data, then ask the program to predict new values via regression or classification, the two major categories of supervised machine learning problems.

 

2017-04-11-005-Major-Supervised-Learning-Algos

 

Moving forward, we’ll dive deeper into one-variable linear regression (dare we say the hello world of machine learning?) as well as fleshing out other key concepts and methods. If you’re interested in this, you might also be interested in learning PostgreSQL, how to develop Alexa skills, or algorithmic trading. Take care.

DSC_0024 Zach Doty Cover Photo for What is Machine Learning

What is Machine Learning?

Hello there, fellow Machine Learning (ML) students! Welcome back to our crash course in starting machine learning from an absolute beginner’s perspective.

In our previous article, we covered an introduction to Machine Learning, answering several key questions:

  • Where is machine learning used in our lives?
  • Where did machine learning come from?
  • Where is machine learning headed?

Forging ahead in our learning journey, we’ll introduce some definitions of machine learning and look at the major types of machine learning applications.

 

Machine Learning (ML), A Casual Definition by Arthur Samuel

Our first definition, teased in the last article, follows:

Machine learning is the practice of giving computers the ability to learn without being explicitly programmed to do so.

 

More on Arthur Samuel & Why His Definition on ML Matters

If you’re like me, you might not have heard of Arthur Samuel. Who is he, and why does his opinion matter in the fields of artificial intelligence and machine learning?

Arthur Samuel was a pioneer in artificial intelligence and computer gaming fields. In 1959, he coined the term “machine learning” as a founding father in the field. That’s why he’s important! Let’s also look at a more formal / scientific definition.

 

A More Formal Machine Learning Definition

Tom Mitchell, of Carnegie Mellon, offers a definition with more structure.

  • A well defined learning problem follows this structure:
  • E * T = P
    • Note: His definition does not include mathematical operators. I’m taking a large liberty to insert them myself. ¯\_(ツ)_/¯
  • A program learns from Experience (E) at a Task (T), as measured by Performance (P); performance at the task improves with experience

2017-04-06-001-Machine-Learning-Definition-ETP-Framework

Here’s a further example:

Example: playing Go.

E = the experience of playing many games of Go.

T = the task of playing and winning Go.

P = the probability that the program will win the next game.

 

Major Categories of Machine Learning Algorithms

If you judge by press coverage of ML as I have, it appears to be a nebulous field. (In all fairness, it may still be.) However, there is structure we can lean on as we learn ML. There are two major types of machine learning algorithms:

  • Supervised learning algorithms
  • Unsupervised learning algorithms

There are a couple of other prominent types of machine learning algorithms as well: reinforcement learning and recommender systems.

 

 Wrap-Up

Congratulations, we’ve cleared a very gentle introduction to machine learning and its novice/high-level definitions. I look forward to learning more with you, dear reader! Our next articles will cover a bit more detail about the two major ML algorithm types: supervised learning and unsupervised learning. Until then, look after each other.

DSC_0104 Zach Doty Cover Photo for Learning LIKE SQL Statement

The LIKE Statement: SQL Statement Fundamentals

Howdy folks! We are overdue for another installment of SQL learning. I’ve slept a few times since the past couple of articles on how to learn SQL. Previously, we talked about the IN statement, BETWEEN statement, and ORDER BY clause.

In this article, we’ll learn how to execute the LIKE statement in SQL queries. Let’s jump in.

 

About the LIKE Statement & Why it’s Important

Have you ever worked with a data set that’s overwhelmingly large or complex? Or overwhelmingly large and complex? Sometimes, you need to find data, but can’t recall the exact string or values for a lookup. Or perhaps you’re working with a messy data output from, say, Google Keyword Planner that groups a range of close variants into one value?

Say you’re looking for values related to designer clothing and designer clothes. Without a better solution, you’re probably stuck doing a bunch of sorting, filtering, classifying and other data sleuthing at great expense to time and sanity.

The LIKE statement exists to help with the debacle of only having / knowing  part of the lookup criteria you need, courtesy of pattern matching.

PostgreSQL and many other SQL engines/platforms support the LIKE statement, which functions a bit like the below. (Pun not intended.)

 

SELECT keyword, search_volume

FROM some_table

WHERE keyword LIKE 'cloth%';

 

The above tells pgAdmin / PostgreSQL to get the keyword and search volume columns from some_table, where the keyword values begin with ‘cloth’ and are followed by anything else (the percent sign acting as a wildcard). The combination of a text string and a wildcard operator like this is known as a pattern.

When you execute the LIKE statement in a SQL query, pgAdmin will begin reading through the table rows to see if the pattern you’ve specified returns any matches. For the seasoned marketing technology folks, this functionality sure does resemble regex in some ways.

However, there are some differences.

  • Here, instead of the * character being the wildcard, the % sign serves as a wildcard matching any sequence of characters.
  • If you want to match a single character, the underscore character is used.

 

LIKE Statement Syntax & Examples

Let’s try some examples. Below, we’ll call on the faithful DVD rental practice database, and run a query for customers that have first names like Jen. (Jennifer, Jenny, etc.) Our below code produces the following result.

SELECT first_name,last_name

FROM customer

WHERE first_name LIKE 'Jen%';

2017-03-30-001-LIKE-SQL-Statement-Example

 

That being considered, there are other ways we can use the LIKE statement. Above, we used a wildcard to match any endings to a particular string.

Conversely, we could execute a SQL query that specifies a certain ending value, with the wildcard preceding. If we extend that example to such a query below, we should see the following result:

SELECT first_name,last_name

FROM customer

WHERE first_name LIKE '%y';

2017-03-30-002-LIKE-SQL-Statement-Example

 

Above, we’ve flipped the tables so we capture every possible beginning under this pattern. Also important to note, the patterns aren’t limited to beginnings or ends. You can use this wildcard in the middle, etc. Consider the following:

SELECT first_name,last_name

FROM customer

WHERE first_name LIKE '%er%';

2017-03-30-003-LIKE-SQL-Statement-Example

 

Now, on top of all this awesomeness, realize we’ve been using it as a matching filter. We can also employ the NOT LIKE syntax to exclude values meeting the pattern we’ve specified.

Let’s mix things up a bit. We mentioned a second type of pattern matching character that we haven’t used yet: the underscore.

SELECT first_name,last_name

FROM customer

WHERE first_name LIKE '_her%';

2017-03-30-004-LIKE-SQL-Statement-Example

 

Above, we’ve made a similar ask. However, instead of requesting all possible matches, we’ve mandated SQL only return names with exactly one character (any character), followed by ‘her’, followed by anything else.

To throw you a quick curveball, it’s worth noting the LIKE statement is case sensitive in its matching. Could be bad, could be good, could be neither. However, there’s a way for you to force case insensitivity on your queries. The difference in your statement is relatively minor: instead of LIKE or NOT LIKE, you’ll use ILIKE (or NOT ILIKE).

 

Wrap-Up

Hope you found this useful! Stay tuned for more SQL learnings and application. If you’re new here, visit the page on how to learn SQL. If you’re interested in more educational material, check out our ongoing series of how to develop Amazon Alexa voice search skills, and getting started with algorithmic trading. Cheers!

Building an Interactive Alexa Quiz Skill, Part 2

Disclaimer: this was typed late at night on a tired mind. Please excuse typos, convention errors, and generally poor writing. 🙂

Howdy, Alexa nerds! Welcome back to our journey in learning Amazon Alexa Skill Development. Quick funny aside, would you care to guess my most common use of the Echo? It’s to play looong Spotify playlists that are basically background noise to help our new dog when Hannah & I go to the gym or meet friends. Anyway!

Let’s jump back in. The previous article covered setting up an AWS Lambda function for the Alexa Skill Service. Now, we’ll be working more with the Skill interface. See below for the conceptual overview, or an early article on building Alexa skill interfaces for a basic fact skill.

2017-03-07-001-Alexa-Skill-Interface-Develpoment-Framework

 

Working in the Amazon Developer Console: Alexa Skills Kit

You probably know the drill by now: log in to the Amazon Developer console. Once you’ve logged in, select the “Alexa” menu item from the home screen, then choose the “Alexa Skills Kit” option.

2017-03-27-B-001-Alexa-Quiz-Skill-Create-Alexa-Skill-Interface-Start

If you’ve previously published or started development of skills, you should see them listed on this screen. Now, click, “Add a New Skill”. We should be looking at a very familiar screen here. 🙂

Add/edit the following:

  • Language (assuming you’ll leave the English US default here)
  • Name of the Skill displayed in the Alexa app and store
  • Invocation name users will speak to start your skill

Click Save and Next to proceed.

2017-03-27-B-002-Alexa-Quiz-Skill-Create-Alexa-Skill-Information

 

Working with the Interaction Model

Here comes the tough part: more copy and paste! Okay, sarcasm and humor don’t always translate well via text. We’re going to continue to lean fairly heavily on Amazon’s examples here to get ourselves familiarized with the more advanced concepts of intent schemas and slot types.

That caveat aside, head back to the files we originally downloaded, but this time, we’re interested in the speechAssets folder and its contents:

  • Intent schema (JSON file)
  • Sample utterances (text document)

First, let’s open up the Intent Schema JSON file in our text editor of choice. Below is a look at what you should approximately be seeing. Copy and paste the entirety of the JSON file into the Intent Schema field of the Interaction Model tab.

2017-03-27-B-003-Alexa-Quiz-Skill-Create-Alexa-Skill-Intent-Schema-JSON

Audible: Our First Encounter with Custom Slot Types

Alright, no smooth segue here. We’re having the first encounter with what’s known as custom slot types.

If you try to save the skill progress so far, you’ll receive an error message from the developer console that says something like, “Error: There was a problem with your request: Unknown slot type ‘LIST_OF_ANSWERS’ for slot ‘Answer’.” Why is that?

2017-03-27-B-004-Alexa-Quiz-Skill-Create-Alexa-Skill-Custom-Slot

If you take a closer look at the Intent Schema JSON file, you’ll notice that most of the intents are built-in Amazon intents, e.g., “intent”: “AMAZON.RepeatIntent”. The “AnswerIntent” looks nothing like the built-in Amazon intents. Instead, we see a name, “Answer”, and a type, “LIST_OF_ANSWERS”, which was so delicately referenced in the error message.
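Pieced together from the description above (your downloaded speechAssets file is the source of truth, and the particular built-in intents may differ), the intent schema JSON has roughly this shape:

{
  "intents": [
    { "intent": "AMAZON.HelpIntent" },
    { "intent": "AMAZON.StopIntent" },
    { "intent": "AMAZON.CancelIntent" },
    { "intent": "AMAZON.RepeatIntent" },
    {
      "intent": "AnswerIntent",
      "slots": [
        { "name": "Answer", "type": "LIST_OF_ANSWERS" }
      ]
    }
  ]
}

That custom LIST_OF_ANSWERS type is the piece the developer console doesn’t know about yet, hence the error.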

So how do we remedy this situation? We use the information presented to us in the error message and the JSON file to work our way over this issue. You’ll likely note that the custom slot types section prompts you to “Enter Type”.

2017-03-27-B-005-Alexa-Quiz-Skill-Create-Alexa-Skill-Custom-Slot-Values

Match that information up with our error message and the JSON code, and we’ll enter, “LIST_OF_ANSWERS”. In the values section, we’ll enter on separate lines: 1, 2, 3, and 4. I’ll note here for clarity, that this essentially corresponds to the A/B/C/D multiple choice functionality of the quiz. We’ll see this in greater detail in a bit.

Okay, click “Add” as highlighted above, then click “Save”. Next, return to your files and open up the Sample Utterances text file. You should see something like the below.

2017-03-27-B-006-Alexa-Quiz-Skill-Create-Alexa-Skill-Sample-Utterances

You’ll note it’s quite a bit different from the simple, fact-dispensing Skill we previously built. Take note of the {Answer} sample utterance. These are the pieces of dynamic input and interaction coming together into an Alexa Skill. We’ve defined a custom interaction outside of Amazon’s standard functions, and specified a range of acceptable answers the user can give us. That whole structure meets the user experience here, called in by the {Answer} slot name and custom slot type.

Enough conceptual babble. Copy and paste the sample utterances text into the developer console! Click Save beneath the sample utterances, and click Next.

2017-03-27-B-007-Alexa-Quiz-Skill-Create-Alexa-Skill-Sample-Utterances-List

 

Continuing the Skill Interface Build-out

Alright, so far, we’ve accomplished the following:

  • Provided basic skill information about our new skill
  • Specified details about the interaction model, including;
    • Intent Schema
    • Custom Slot Type
    • Acceptable/specified values for the custom slot type
    • Sample utterances

Next, we need to fill in some simple but crucial configuration details. Remember the ARN we generated by setting up the AWS Lambda function in the previous article? You need it here. Below you can see:

  1. I’ve selected the recommended endpoint type of AWS Lambda ARN
  2. Selected my geographic region of North America and,
  3. Pasted in the full ARN

I’m not going to work with account linking yet, because honestly, it looks really darn complicated and it’s well past midnight as I type this. Soon. 🙂 Click Next and proceed to the testing tab!

In the testing tab, you should first see that the skill is enabled for testing on your account. You can:

  • play back responses from Alexa in the voice simulator to test pronunciation, etc
  • More importantly, use the service simulator to run a sample utterance, and see if your skill actually works.

2017-03-27-B-008-Alexa-Quiz-Skill-Testing-Alexa-Skill

Above, we can see the response to our sample utterance asking SEO Quiz returns as expected. Woohoo! Also, did you know the Alexa voice simulator automatically bleeps out most curse words? Did you know you can kind of work around that by putting extra vowels in the word? I digress. (It’s almost 1 am writing this now, productivity on the rise!) When you’re satisfied, click Next.

We’re getting close! Time to enter some publishing information. I’ll leave the first few sections to you: Category, Sub-Category, Testing Instructions, Country/Region availability, Short and Full skill descriptions.

Now, in the example phrases, I provided some updates to the sample utterances, namely to the starting Intent. Below, see the example phrases of “Alexa open SEO Quiz” and so forth. The “gotcha” here that set me back on my first skill is that the example phrases must be derived from your sample utterances.

2017-03-27-B-009-Alexa-Quiz-Skill-Testing-Alexa-Skill

Upload your 108×108 and 512×512 pixel icon images, click Next and submit the requisite privacy & compliance information. Done!

 

Wrap-Up

So, we’re mostly done, but not completely done. The part for us to do now is customizing the template code in your AWS index.js file. Ideally, I would prefer a more eloquent closing, but it’s late; that will have to wait for another time. Look after each other.

 

DSC_0002 Zach Doty Cover Photo for Interactive Alexa Quiz Skill Development

Building an Interactive Quiz Alexa Skill, Part 1

Hello Alexa geeks! Welcome back to our journey of learning how to develop Amazon Alexa Skills for the Echo and more. Last time, we completed the build process for our first simple “fact-dispensing” Alexa Skill.

In this article, we’ll start the process for a skill that accepts user input in the form of a quiz, fun! If you recall from our first skill, there are two parts to skill development:

  1. The Skill service development, in AWS (Lambda)
  2. The Alexa Skill interface details through the Amazon / Alexa Developer Console

 

2017-03-01-001-Alexa-Skill-Develpoment-Framework

 

Getting Started in AWS Lambda

You’ll notice as we progress beyond our early articles, there will be less detail paid to the more basic instructions. First, log in to the Amazon AWS portal.

Navigate to the Lambda service. If you’re the casual developer just working in this course, odds are the Lambda link will be near the top of screen under “Recently Visited Services”. Once you’ve clicked through, click, “Create a Lambda Function”.

2017-03-27-001-Alexa-Quiz-Skill-Create-AWS-Lambda-Function

On the next screen, you should see something like “Select blueprint” (Note: at the rate of change Amazon has been pursuing, this screen could change, even in a matter of weeks!) Click the “Blank Function” option, we’re starting this one from scratch!

2017-03-27-002-Alexa-Quiz-Skill-Create-Blank-AWS-Lambda-Function

The next screen should be, “Configure triggers”. Click inside the gray dash-outlined box, and select, “Alexa Skills Kit” from the dropdown menu. Click next!

2017-03-27-003-Alexa-Quiz-Skill-Configure-Alexa-Skills-Kit-Trigger

 

AWS Lambda Function Configuration for Alexa

Now we should be able to configure the basics of our function. Enter the following:

  • Function name
  • Description

The default runtime environment should be Node.js 4.3. If not, change it to Node.js 4.3.

(Note: Amazon just introduced support for Node.js 6.10, so that may be the preferred format going forward!) Will try to provide an article update, should that be the case.

2017-03-27-004-Alexa-Quiz-Skill-Configure-Alexa-Function

Onward! Now we need to upload some code to this burgeoning success. Throwback time: do you remember the files we downloaded in one of the first articles? Time to go back to them again. In your folder of numbered skill templates, go to the “2-reindeerGames” folder, then the “src” folder, and open the index.js file in your text editor of choice.

2017-03-27-005-Alexa-Quiz-Skill-Lambda-Function-Code

Copy and paste the contents (replacing all previous code) into the code window that should appear. This assumes you’ve selected the Code entry type of “Edit code inline” for the Lambda function code. As we work on increasingly more advanced skills, we will likely use the zip upload feature to accommodate additional code resources. The astute will note we’ve merely copied and pasted code here. Yes, we’ll go back and customize soon. 🙂

Beneath the code window, leave the index.handler intact, select an existing role option in the Role dropdown menu, and use the role we previously created. Leave the other settings as-is, click the “Next” button to review details, and click, “Create Function”!

2017-03-27-006-Alexa-Quiz-Skill-Lambda-Function-Creation

Be sure you take note / record the ARN in the upper right-hand corner, as we’ll need that in our forthcoming Skill Interface development section.

Wrap-Up

That’s the first part! I don’t know about you, but this is getting easier as I go. We’ll next cover the skill interface and customization to make it your own skill. If this is your first article, be sure to check out the running stable of articles on how to learn Amazon Alexa skill development. Also, there’s a growing body of work on how to learn PostgreSQL, and some fledgling articles on learning algorithmic trading, for good measure.

Share your experience, thoughts and feedback in the comments below. Don’t be a stranger, help your friends along in Alexa Skill development and share with them. Cheers!

 

 

 

 

 

 

How to Build Your First Amazon Alexa Skill, Pt. 2: Skill Interface

Welcome back, Alexa geeks! In the last article, we laid the groundwork for making our first Amazon Alexa skill. We covered the concepts and frameworks in Alexa skill development. We also did some work in AWS Lambda to prepare for voice requests  being made to our service.

 

Quick Recap on The Alexa Skill Service / AWS

We’ve slept a few times since covering the concepts and AWS framework, so let’s quickly recap.

2017-03-01-002-Alexa-Skill-Process-Framework

Whenever we use an Alexa skill, our voice data is processed through the hardware device, through the skill interface (what we’re looking at today) for language processing, then converted into text for a program to execute against, and back again. Simple enough, right?

In the last article, we looked at the “last” leg of this process, the skill service and AWS Lambda. Now, we’ll be working with the skill interface portion.

2017-03-07-001-Alexa-Skill-Interface-Develpoment-Framework

 

Setting Up the Basic Skill Information

Alright, log into the Amazon Developer Portal and select the Alexa menu option. Once there, you should have a choice between the Alexa Skills Kit or the Alexa Voice Service. Click on the Alexa Skills Kit link to continue. You’ll want to click “Add a New Skill”. Because I’ve already developed a skill, Silly Marketing Strategies is already there. However, most of y’all will probably not have anything else on screen.

2017-03-07-002-Alexa-Skill-Interface-Develpoment-Framework

To start, you’ll want to select Custom Interaction Model and leave the default language as English (US) (unless you’re developing in Espanol?). Type out the name and invocation name for your skill. The name isn’t necessarily important at this moment. However, the invocation name will be extremely important!! This is how users will call your Skill into service.

Note: the astute will notice I’ve deviated from the Space Geek example of the last article. More on that later. 🙂

When you’re ready, click next to proceed, and we’ll tackle the interaction model.

2017-03-07-003-Alexa-Skill-Interface-Develpoment-Framework

 

Setting up the Skill’s Interaction Model

Alright, time for some important things here. For the purposes of the factoid-based game, we’ll only be looking at the Intent Schema and Sample Utterances fields.

First, we need to look at the Intent Schema. Remember the files you downloaded in the first article? You’ll need to go into the SpaceGeek folder, then the speechAssets subfolder, and open the IntentSchema.json file. I’ve opened it up in Sublime Text 2 briefly, so we can take a quick look at the file. This is JSON, with an array of intents. Below, we’ve got a pretty simple set: intents for retrieving a fact from the skill, getting help, stopping and cancelling. It’s easy, right? Ha.

2017-03-07-004-Alexa-Skill-Interface-Develpoment-Intent-Schemas

Quick note: because these intents are prefixed with “AMAZON.”, it means they’re built-in Amazon intents. Enough babbling, copy and paste the contents of this file into the Intent Schema section of the Amazon Developer Console.

2017-03-15-005-Alexa-Skill-Interface-Develpoment-Intent-Schemas-Utterances

Above, we’ve pasted in the Intent Schema. Next, we need to provide some sample utterances. Sample utterances are what you think users might say to engage your Alexa Skill. Below, we’ve provided such examples as, “Tell me a Weimaraner fact”.

Next, we need to hook up the Alexa Skill interface we’ve put together with some computing power. Specifically, we need to hook it up to AWS Lambda! (Remember the first article where we did a bunch of Lambda setup?)

If you recall, there was an Amazon Resource Name (ARN) string that we copied and saved to a text file. Retrieve it now and paste into the “Configuration” screen of the Skill interface setup.

2017-03-15-006-Alexa-Skill-Interface-Develpoment-Intent-Schemas-Utterances

Provided the ARN you’ve entered is valid, you should be able to proceed to the next step. Note: we are ignoring the account linking functionality for now. That functionality allows you, for example, to integrate Twitter into your skill to share a Tweet.

Next, we’ll move on to the Test tab. Three things (below) to take note of:

  1. Ensure you’ve completed the Interaction Model tab, so you can complete the testing in this tab.
  2. Try / type out key phrases in your skill to hear how they’ll be pronounced, via the Voice Simulator.
  3. Enter some utterances into the Service Simulator to A) make sure your skill is functioning as intended and B) Get a feel for how the end user experience will happen

2017-03-18-006-Alexa-Skill-Interface-Develpoment-Testing-Tab

Once you’ve tested your Skill, proceed to Publishing Information. Here, you’ll need to include the following:

  • Category of your Alexa Skill
  • The relevant Sub Category
  • Optional: testing instructions if your skill requires credentials or other unusual needs. You probably don’t need to include anything for this example
  • Country & Region targeting
  • Short Skill Description
  • Full Skill Description
  • Example Phrases, drawn from your sample utterances, and preceded by the Alexa wake word and skill invocation name
  • Optional: keywords that will help Alexa users find your skill in search
  • Icon images in 108×108 and 512×512 pixel dimensions

Below are a couple of screenshots for how I’ve filled out these fields.

2017-03-18-007-Alexa-Skill-Interface-Develpoment-Publishing-Information

2017-03-18-008-Alexa-Skill-Interface-Develpoment-Publishing-Information

Alright, we’re so close! The last field is Privacy & Compliance. For this example, you should be checking “No” to all the radio buttons:

  • No, skill doesn’t allow users to make purchases or spend real money
  • No, skill does not collect personal information from users
  • No, skill does not target children under the age of 13

However, do note that some of these things (except for child targeting) may change as we progress in our Alexa Skill development capabilities.

If desired (or later required by more advanced capabilities) you may specify privacy policy and terms of use URLs.

If you’ve completed the above, you should be good to Save & Submit for Certification!

 

Wrap Up

This is a deceptively involved process. You will note that neither in the previous article, nor this article, have we changed the original source code for the SpaceGeek / Weimaraner Facts Skill. We’ll cover this in more detail in the next article. Until next time, check out my journey in learning NoSQL and keep an eye out for more content soon!

 

How to Build Your First Amazon Alexa Skill, Part 1 – The Skill Service

Introductory Disclaimer: AWS is changing, and changing fast! Between developing my first skill in mid-January 2017 and now going back to learn more / teach, quite a lot has changed in the AWS developer areas. All that to say is that I’ve already had to cobble screen shots together to make this article work, and the details could quickly become outdated. I don’t really intend to update this frequently, once I’ve completed my learnings. Apologies!

Welcome back, Amazon Alexa geeks! First, thanks for both your patience and kind feedback on the process of building your own Alexa and testing your DIY Alexa. Honestly, that was one of the hardest challenges I’ve had since first learning HTML / CSS coding back in college. (Thanks, Raspbian / Linux command line…)

Today, we’re getting back into actual Alexa skill development. We’ll be building your first (my second) Amazon Alexa skill for use on Echo and other devices.

Side note: I have previously developed an unremarkable skill, Silly Marketing Strategies, but I want to start fresh to:

  1. Do it better, and,
  2. Take a first stab at incorporating analytics capabilities into Alexa skills.

Without further ado, let’s get started with a conceptual overview to how Alexa skills work, and where developers (us!) play.

Quick Concept Introduction

First, let’s look at a simplified process of you using an Alexa skill:

2017-03-01-002-Alexa-Skill-Process-Framework

  • You’ll say something like, “Alexa, play Road Trip Country Playlist on Spotify”
  • The Echo, Fire TV, Raspberry Pi or other hardware takes in the audio and routes it into software/programs
  • The Alexa Skills Kit Interface employs speech recognition and natural language processing/understanding to convert your speech into text strings, then to code
  • The Skill service, facilitated by AWS Lambda, processes the text strings against its suite of skills and programs, and outputs code,
  • Which is fed back through the programs to hardware and to a lovely voice response, something like, “Resuming your queue…”

That’s the front-end experience. As developers, we’ll home in on the “last” two parts of the user experience. In that frame of reference, we start at the logic of our program, working “backwards” to handling user input and the desired result.

2017-03-01-001-Alexa-Skill-Develpoment-Framework

Setting up AWS Lambda

Are we there yet? Okay, for real this time, let’s hop into code…by pulling up 2 URLs and logging in:

  1. https://aws.amazon.com
  2. https://developer.amazon.com/public/solutions/alexa

In the AWS console, you should see a screen like the below. First, update your region (top right) to US East, N. Virginia. Why, you might ask? It’s the only region (currently, Mar. 2017) that supports Alexa.

2017-03-01-003-Alexa-Skill-Logging-In-AWS-Choose-Region

Next you need to select Lambda from the list of AWS services. It should be readily available on your screen. Mine will likely look a bit different from having used it already. 🙂

If you’re a first-time Lambda user, you’ll likely see a welcome/splash screen that resembles the following:

You’ll have the option to select a blueprint (optional). You should be able to click Next, or there should be an Alexa Skills Kit SDK fact skill option you can select.

Moving along, when you launch Lambda for the first time, you’ll need to configure triggers. Below, you’ll want to select the Alexa Skills Kit, and click “Next”. Why are we configuring triggers for functions? Because our Alexa Skill is event-based, and only triggers when an event pertaining to it occurs.

2017-03-01-005-Alexa-Skill-AWS-Triggers

You should now land on a Configure Function screen. These are the baby steps to making a skill! Give the skill a name and description. We’ve used SpaceGeek as values below, since we’ll be using the SpaceGeek template at first. Also, always select Node.js 4.3 for Alexa Skill Development.

2017-03-01-006-Alexa-Skill-AWS-Lambda-Functions

Beneath the basic naming and settings is a Lambda function code section. Choose the “Upload a .ZIP file” option. Remember all the files we downloaded in our first article? You’ll need to go to the SpaceGeek folder, then the src subfolder, zip the contents (shown below), and upload accordingly.

2017-03-01-007-Alexa-Skill-AWS-Lambda-Alexa-Skill-Sample-Files

(You can download the sample files again if needed from here.

Note: It’s March 2017, I started in Jan. 2017 and the original tree /Github URL has already been deprecated. Zoinks, it’s moving fast!)

After you’ve uploaded the file, you need to create a new role from templates (beneath the file upload you just completed.) Feel free to name the role whatever you like, but the important detail here is that you select the S3 object read-only permission option under the policy templates dropdown selection.

2017-03-01-008-Alexa-Skill-AWS-Lambda-Settings

Finally! You should be able to click, “Next” and clear this screen. You should see a review screen confirming the selections you’ve made. Click, “Create Function” and proceed! You should land within the Function itself. Since my new experience was already spent, here’s what the equivalent screen looks like, for Silly Marketing Strategies.

2017-03-01-009-Alexa-Skill-AWS-Lambda-SMS-Function

The last thing you’ll want to do is copy the ARN into a text file, Word document, etc. for safekeeping down the line. You’ll need it soon.

Wrap Up

Alright, we made a ton of progress in today’s article! I was hoping to make more progress, but the process of organizing, documenting, processing and writing is quite time consuming. It’s now 1 a.m. local time for me, and I need sleep for a long day at work tomorrow! We’ll be back very soon on how to work on the Skills Interface (utterances, logic, etc.) in more detail to publish your first skill! If you’re new or need a refresh, here’s our running list of articles on how to learn Amazon Alexa Skill development.