Voice Search Strategy & Optimization for SEO

When’s the last time you thought about voice search strategy & optimization for SEO? At the time of writing this (summer 2021), everyone I know (myself included) is pretty occupied with Core Web Vitals and the Page Experience update rolling out, alongside other algo updates. (See also: Google Traffic Down!)

But it’s refreshing to spend some time thinking about the future, or at least long tail search. 🙂

I saw an interesting post yesterday on voice search optimization and SEO on LinkedIn. Some folks made really good use of the new regex functionality in Google Search Console, stringing together a bunch of OR statements to get data on conversational keywords.
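
For reference, that kind of filter looks something like the regex below, pasted into the Query filter’s “Custom (regex)” option in Search Console (the exact word list is up to you):

^(who|what|when|where|why|how|can|does|is|are)\b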

Only one problem for the seasoned optimization pro: it does nothing for problems of scale. Looking to get more than a few hundred keywords at a time? Out of luck. Looking for keywords by page? Unlikely to happen quickly.

There are even tons of resources out there dicing voice search strategy and optimization 27 ways to Sunday. If Brian Dean is out writing a guide on it, you have to know there’s interest in it! But again, a lot of the most visible material offers basic insight followed by non-scalable instructions on pulling conversational keywords out of Search Console. How can we do better?

Solving for Scale in Voice Search Strategy & Optimization for SEO

What’s an enterprise digital marketer and SEO to do? Sounds like it’s time to revisit our post on how to download from Google Search Console & export all your data! More specifically, if you need scale, that means automation, and automation means rolling up your coding sleeves. 🙂

# Load packages #
library(searchConsoleR)
library(dplyr)
library(ggplot2)
library(writexl)
library(stringr)

# Authorize & choose Google profile #
scr_auth()

# Specify website --- Client Code or Line of Business Code Here #
client <- "CLCODE"
website <- "https://www.twosigma.com"

############ SPECIFYING THE MONTHS ############
start1 <- "2020-04-01"
end1 <- "2020-04-30"

start2 <- "2020-05-01"
end2 <- "2020-05-31"

start3 <- "2020-06-01"
end3 <- "2020-06-30"

start4 <- "2020-07-01"
end4 <- "2020-07-31"

start5 <- "2020-08-01"
end5 <- "2020-08-31"

start6 <- "2020-09-01"
end6 <- "2020-09-30"

start7 <- "2020-10-01"
end7 <- "2020-10-31"

start8 <- "2020-11-01"
end8 <- "2020-11-30"

start9 <- "2020-12-01"
end9 <- "2020-12-31"

start10 <- "2021-01-01"
end10 <- "2021-01-31"

start11 <- "2021-02-01"
end11 <- "2021-02-28"

start12 <- "2021-03-01"
end12 <- "2021-03-31"

start13 <- "2021-04-01"
end13 <- "2021-04-30"

start14 <- "2021-05-01"
end14 <- "2021-05-31"

start15 <- "2021-06-01"
end15 <- "2021-06-30"

# Specify the voice cue variables for dataframe functionality #
who <- "who" # 1
what <- "what" # 1
when <- "when" # 1
where <- "where" # 1
why <- "why" # 1
how  <-  "how" # 1
does <- "does" # 1
have <- "have" # 1
has <- "has" # 1
are <- "are" # 1
which <- "which" # 1
will <- "will" # 1
was <- "was" # 1
should <- "should" # 1
would <- "would" # 1
could <- "could" # 1
near <- "near" # 1
google <- "google" # 1 
alexa <- "alexa" # 1 
do <- "do" # 1
is <- "is" # 1
can <- "can" # 1


# #
# #            _____
# # |\    /|  |     |     1
# # | \  / |  |     |
# # |  \/  |  |     |
# # |      |  |     |
# # |      |  |     |
# # |      |  |     |
# # |      |  |_____|
# #
# #


# # # Start who 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
who_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~who"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(who_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(who,rows)
# primaryCue

# Make a data frame from the created vectors
who_queries_merged_1 <- data.frame(clientName,date,primaryCue,who_queries)
# Testing df creation
# pivotData
# head(who_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(who_queries_merged_1, "who_queries_1.xlsx")

# # # END who 1
# # #
# # #
# # #
# # #


# # # Start what 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
what_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~what"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(what_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(what,rows)
# primaryCue

# Make a data frame from the created vectors
what_queries_merged_1 <- data.frame(clientName,date,primaryCue,what_queries)
# Testing df creation
# pivotData
# head(what_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(what_queries_merged_1, "what_queries_1.xlsx")

# # # END what 1
# # #
# # #
# # #
# # #


# # # Start when 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
when_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~when"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(when_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(when,rows)
# primaryCue

# Make a data frame from the created vectors
when_queries_merged_1 <- data.frame(clientName,date,primaryCue,when_queries)
# Testing df creation
# pivotData
# head(when_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(when_queries_merged_1, "when_queries_1.xlsx")

# # # END when 1
# # #
# # #
# # #
# # #


# # # Start where 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
where_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~where"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(where_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(where,rows)
# primaryCue

# Make a data frame from the created vectors
where_queries_merged_1 <- data.frame(clientName,date,primaryCue,where_queries)
# Testing df creation
# pivotData
# head(where_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(where_queries_merged_1, "where_queries_1.xlsx")

# # # END where 1
# # #
# # #
# # #
# # #


# # # Start why 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
why_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~why"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(why_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(why,rows)
# primaryCue

# Make a data frame from the created vectors
why_queries_merged_1 <- data.frame(clientName,date,primaryCue,why_queries)
# Testing df creation
# pivotData
# head(why_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(why_queries_merged_1, "why_queries_1.xlsx")

# # # END why 1
# # #
# # #
# # #
# # #


# # # Start how 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
how_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~how"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(how_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(how,rows)
# primaryCue

# Make a data frame from the created vectors
how_queries_merged_1 <- data.frame(clientName,date,primaryCue,how_queries)
# Testing df creation
# pivotData
# head(how_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(how_queries_merged_1, "how_queries_1.xlsx")

# # # END how 1
# # #
# # #
# # #
# # #


# # # Start Does 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
does_queries <- search_analytics(
                   siteURL = website, 
                   startDate = start1,
                   endDate = end1,
                   dimensions = c("query", "page"),
     ##### REPLACE QUERY #####
                   dimensionFilterExp = c("query~~does"),
                   searchType="web",
                   rowLimit = 1000000
                   )

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(does_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(does,rows)
# primaryCue

# Make a data frame from the created vectors
does_queries_merged_1 <- data.frame(clientName,date,primaryCue,does_queries)
# Testing df creation
# pivotData
# head(does_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
# write_xlsx(does_queries_merged_1, "does_queries_1.xlsx")

# # # END DOES 1
# # #
# # #
# # #
# # #


# # # Start Have 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
have_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~have"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(have_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(have,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
have_queries_merged_1 <- data.frame(clientName,date,primaryCue,have_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(have_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(have_queries_merged_1, "have_queries_1.xlsx")

# # # END HAVE 1
# # #
# # #
# # #
# # #


# # # Start has 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
has_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~has"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(has_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(has,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
has_queries_merged_1 <- data.frame(clientName,date,primaryCue,has_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(has_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(has_queries_merged_1, "has_queries_1.xlsx")

# # # END has 1
# # #
# # #
# # #
# # #


# # # Start Are 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
are_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~are"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(are_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(are,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
are_queries_merged_1 <- data.frame(clientName,date,primaryCue,are_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(are_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(are_queries_merged_1, "are_queries_1.xlsx")

# # # END ARE 1
# # #
# # #
# # #
# # #


# # # Start Which 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
which_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~which"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(which_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(which,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
which_queries_merged_1 <- data.frame(clientName,date,primaryCue,which_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(which_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(which_queries_merged_1, "which_queries_1.xlsx")

# # # END WHICH 1
# # #
# # #
# # #
# # #


# # # Start Will 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
will_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~will"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(will_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(will,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
will_queries_merged_1 <- data.frame(clientName,date,primaryCue,will_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(will_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(will_queries_merged_1, "will_queries_1.xlsx")

# # # END will 1
# # #
# # #
# # #
# # #


# # # Start was 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
was_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~was"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(was_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(was,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
was_queries_merged_1 <- data.frame(clientName,date,primaryCue,was_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(was_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(was_queries_merged_1, "was_queries_1.xlsx")

# # # END was 1
# # #
# # #
# # #
# # #


# # # Start should 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
should_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~should"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(should_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(should,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
should_queries_merged_1 <- data.frame(clientName,date,primaryCue,should_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(should_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(should_queries_merged_1, "should_queries_1.xlsx")

# # # END should 1
# # #
# # #
# # #
# # #


# # # Start would 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
would_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~would"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(would_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(would,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
would_queries_merged_1 <- data.frame(clientName,date,primaryCue,would_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(would_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(would_queries_merged_1, "would_queries_1.xlsx")

# # # END would 1
# # #
# # #
# # #
# # #


# # # Start could 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
could_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~could"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(could_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(could,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
could_queries_merged_1 <- data.frame(clientName,date,primaryCue,could_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(could_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(could_queries_merged_1, "could_queries_1.xlsx")

# # # END could 1
# # #
# # #
# # #
# # #


# # # Start near 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
near_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~near"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(near_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(near,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
near_queries_merged_1 <- data.frame(clientName,date,primaryCue,near_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(near_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(near_queries_merged_1, "near_queries_1.xlsx")

# # # END near 1
# # #
# # #
# # #
# # #


# # # Start google 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
google_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~google"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(google_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(google,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
google_queries_merged_1 <- data.frame(clientName,date,primaryCue,google_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(google_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(google_queries_merged_1, "google_queries_1.xlsx")

# # # END google 1
# # #
# # #
# # #
# # #


# # # Start alexa 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
alexa_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~alexa"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(alexa_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(alexa,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
alexa_queries_merged_1 <- data.frame(clientName,date,primaryCue,alexa_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(alexa_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(alexa_queries_merged_1, "alexa_queries_1.xlsx")

# # # END alexa 1
# # #
# # #
# # #
# # #


# # # Start do 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
do_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~do"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(do_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(do,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
do_queries_merged_1 <- data.frame(clientName,date,primaryCue,do_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(do_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(do_queries_merged_1, "do_queries_1.xlsx")

# # # END do 1
# # #
# # #
# # #
# # #


# # # Start is 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
is_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~is"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(is_queries)
# Testing purposes - making sure rows vector is properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(is,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
is_queries_merged_1 <- data.frame(clientName,date,primaryCue,is_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(is_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(is_queries_merged_1, "is_queries_1.xlsx")

# # # END is 1
# # #
# # #
# # #
# # #

# # # Start can 1
# # #
# # #
# # #
# # #

############ REPLACE QUERY  ############
can_queries <- search_analytics(
  siteURL = website, 
  startDate = start1,
  endDate = end1,
  dimensions = c("query", "page"),
  ##### REPLACE QUERY #####
  dimensionFilterExp = c("query~~can"),
  searchType="web",
  rowLimit = 1000000
)

# Dynamically get the number of rows from the GSC export so the created vectors match number of rows
############ REPLACE QUERY  ############
rows <- nrow(can_queries)
# Testing purposes - making sure rows vector was properly formed
# rows

# Make a vector that populates the client code
clientName <- rep(client,rows)
# Testing vector creation
# clientName

# Make a vector that populates the start date for Month filtering, segmenting, pivoting
########                       ########
############ REPLACE DATE  ############
date <- rep(start1,rows)
# Testing vector creation
# date

############ REPLACE QUERY  ############
primaryCue <- rep(can,rows)
# primaryCue

# Make a data frame from the created vectors
############ REPLACE QUERY  ############
can_queries_merged_1 <- data.frame(clientName,date,primaryCue,can_queries)
# Testing df creation
# pivotData
############ REPLACE QUERY  ############
# head(can_queries_merged_1,5)

############### QA CHECKPOINT ######################
# Write the data frame to an XLSX file
############ REPLACE QUERY  ############
# write_xlsx(can_queries_merged_1, "can_queries_1.xlsx")

# # # END can 1
# # #
# # #
# # #
# # #

month_1_queries <- rbind(who_queries_merged_1,what_queries_merged_1,when_queries_merged_1,where_queries_merged_1,why_queries_merged_1,how_queries_merged_1,does_queries_merged_1,have_queries_merged_1,has_queries_merged_1,are_queries_merged_1,which_queries_merged_1,will_queries_merged_1,was_queries_merged_1,should_queries_merged_1,would_queries_merged_1,could_queries_merged_1,near_queries_merged_1,google_queries_merged_1,alexa_queries_merged_1,
do_queries_merged_1,is_queries_merged_1,can_queries_merged_1)

write_xlsx(month_1_queries, "month_1_queries.xlsx")
# Write the data frame to an XLSX file
# write_xlsx(does_queries, "does_queries.xlsx")

You can also find the code in a GitHub gist.
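
If copying and pasting 22 of those blocks for every month sounds painful, here’s a more compact sketch of the same idea (it’s not the version in the gist): loop over the cue words instead of repeating a block per cue. It assumes the same packages, authorization, and the client, website, start1, and end1 objects defined above.

# Loop over the voice cue words instead of copy-pasting a block per cue
cues <- c("who","what","when","where","why","how","does","have","has","are",
          "which","will","was","should","would","could","near","google",
          "alexa","do","is","can")

pull_cue <- function(cue, startDate, endDate) {
  df <- search_analytics(
    siteURL = website,
    startDate = startDate,
    endDate = endDate,
    dimensions = c("query", "page"),
    dimensionFilterExp = paste0("query~~", cue),
    searchType = "web",
    rowLimit = 1000000
  )
  # Skip cues with no matching queries for the month
  if (is.null(df) || nrow(df) == 0) return(NULL)
  data.frame(clientName = client, date = startDate, primaryCue = cue, df)
}

# Pull every cue for month 1 and stack the results into one data frame
month_1_queries <- do.call(rbind, lapply(cues, pull_cue,
                                         startDate = start1, endDate = end1))
write_xlsx(month_1_queries, "month_1_queries.xlsx")

The same function can then be mapped over the other start/end pairs to build the remaining monthly files.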

Closing Thoughts on Voice Search Strategy & SEO

Stay tuned in this space! I’ll be publishing more insight soon(ish) on how this work is valuable to your digital marketing agency’s strategy & planning, even if voice search doesn’t quite speak up as soon as we think.

Basic Command Line Commands

Let’s talk basic command line commands today! I’ve shared some recent command line articles on Google PageSpeed Insights on Command Line (CLI) and Running Google Lighthouse programmatically, in addition to older posts like Screaming Frog Command Line (CLI) automation.

I want to share some more fundamental components of command line and batch scripting: basic command line commands! (Spoiler alert: this article, like the others, only covers Windows Command Line, not Mac or Linux.)

Getting Started With Command Line / CMD

To do anything with command line commands, you’ll first need to open up the Command Prompt! This is an easy start: click into your search bar (if you have Windows 10 or a more recent version) and type “CMD.”

The Command Prompt app should pop up as the best match – click it!

Opening up command prompt for basic command line commands

I certainly don’t recommend doing this every time. For convenience, you can right-click the Command Prompt icon in your taskbar while it’s open and pin it for quick access. (Similar for the Start Menu, if that’s your fancy.)

Once Command Prompt is open, you should see some information about the version of Windows you’re running, the copyright statement, and your default current directory below.

The Command Prompt (CMD) for basic command line commands

Basic Commands & Parameters in CLI / CMD

Surprisingly, I feel there’s a gentle learning curve to using Command Prompt vs. some other programming languages or settings. One of the first things you can do is learn the basic “super user” kind of commands in command line.

What’s Here? Using the “dir” command

The “dir” command in CLI/CMD isn’t far from Hello World! It shows you information about the folders and files in your current directory.

You may often find that the default current directory for Command Line is C:\Users\[YOUR USER NAME HERE]

The line containing the single period represents the current directory, and the line containing two periods represents the parent directory above your current one. (More often than not, this will be the C:\Users\ directory.)

From left to right, the dates and times show when each file or folder was last modified (dir /T:C will show creation times instead). If the line is a folder, you’ll see the “<DIR>” notation; if it’s a file, the file size in bytes is shown instead, with the folder or file name on the far right. At the bottom, a summary shows the total count of files and directories and the free bytes remaining on the drive.
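
Here’s roughly what the output looks like (the folders, files, dates and sizes below are made up for illustration):

C:\Users\yourUserName>dir
 Directory of C:\Users\yourUserName

07/12/2021  09:15 AM    <DIR>          .
07/12/2021  09:15 AM    <DIR>          ..
07/12/2021  09:14 AM    <DIR>          Documents
07/10/2021  04:02 PM            12,288 notes.txt
               1 File(s)         12,288 bytes
               3 Dir(s)  123,456,789,012 bytes free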

I need somebody (Help!) Not just anybody

Like any programming language, there are a ton of commands, options, functions – you name it! For command line commands, just put help in front of your command: run help followed by the command name, and you should get back instructions and the available parameters for that command.
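
For example, either of these will print the documentation and available parameters for the dir command:

help dir
dir /?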

Helpers: Parameters in Basic Command Line Commands

Parameters are the first building block to doing fun things with basic command line commands. Parameters may also be known as options or arguments, depending on the setting.

Parameters can take the shape of words, symbols or letters, or sometimes a mix of these. Adding parameters can help you customize the output of your commands in command line.
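
A few real examples with the dir command (all read-only, so they’re safe to experiment with):

  • dir /b (bare format: just the file and folder names)
  • dir /a-d (show files only, hiding directories)
  • dir /o-d (sort by date and time, newest first)
  • dir *.json (only show items matching a wildcard pattern)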

Google PageSpeed Insights on Command Line (CLI)

Welcome back to more in adventures in Command Line! Previously, we were looking at running Google Lighthouse programmatically.

Today, we’re going to explore a similar exercise: running Google PageSpeed Insights on Windows Command Line (CLI)! Running PageSpeed Insights on Command Line was probably the easiest self-coding experience I’ve had to date. (More on the hard part later. 🙂 )

3 Prep Steps for Running PageSpeed Insights on Command Line

Unlike Screaming Frog Command Line automation, there are very few parameters or options to mess with for running data on a single URL. To run a single URL on PageSpeed Insights from Command Line:

  • Ensure curl works in your command line (test curl https://www.sampledomainhere.com) and check that the HTML source code is returned
    • Most recent Windows machines should already have curl ready out of the box for CLI usage
  • Visit the Google PageSpeed Insights API documentation page
    • Get an API key (free) and create a project for credentials (also free) if needed
  • Copy the code sample into a code editor or text editor

Running Your First URL on PageSpeed Insights CLI

Is CLI for PageSpeed Insights easy? Yes!

Effortless? No.

Here’s the out-of-the-box code sample from Google:

curl https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://developers.google.com

If you try to run that out of the box on Windows, you’ll likely run into one or more of these errors:

  • 429 error: Too many requests / exhausted
  • Invalid syntax: Command(s) not recognized

First, you’ll need to append your API key to the end of the URL.

curl https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://www.zldoty.com&key=yourAPIKeyHere

Here’s what did the trick for me: placing escape characters in front of any special operators. Specifically, the ampersand (&) and equals (=) signs have special functions in CMD and need to be escaped with a caret (^).

curl https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url^=https://www.zldoty.com^&key^=yourAPIKeyHere

Great, CLI Works! What’s next?

If everything works, you should see a flurry of JSON code scroll through the command prompt window – hooray! But here’s the rub: what do you do with this information?

Some extra code will help us capture the output into a file:

curl https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url^=https://www.zldoty.com^&key^=yourAPIKeyHere -o fileName.json

Adding -o for output and specifying a file name will show a download progress table in the CLI while the request runs. After the file has downloaded (sample runs took about 18 seconds), you should be able to see it in the default CLI directory in Windows File Explorer. (Typically C:\Users\yourUserName)

Stay tuned for some more updates on how to scale this process across multiple URLs and extracting insights from the JSON file!
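
In the meantime, if you want a rough head start on the multiple-URL piece, here’s a minimal batch-file sketch. It assumes a plain-text file named urls.txt (one URL per line) sitting next to the script and your own API key; quoting the full request URL inside a batch file also spares you the caret escapes.

@echo off
setlocal enabledelayedexpansion
set cnt=0
:: Run PageSpeed Insights for each URL in urls.txt and save numbered JSON files
for /f "usebackq delims=" %%u in ("urls.txt") do (
  set /a cnt+=1
  curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=%%u&key=yourAPIKeyHere" -o psi_!cnt!.json
)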

Actually Running Google Lighthouse Programmatically

Are you ready to run Google Lighthouse programmatically? This challenge has weighed on me for years. But with the help of some resources and friends, I’d love to show you a start-to-finish guide to truly running Google Lighthouse programmatically.

Programmatic Means Prep Work

First, my personal machine runs Windows 10, so my apologies to Mac & Linux users out there. Here’s what you need to get started, and huge shout-out to Upbuild for cutting the first part of the path here.

  • Install Node: https://nodejs.org/en/download/
  • After installation, search your start menu for “Command Prompt” or CMD
  • Assuming Node was installed correctly and your command line is correctly configured, the following line in Command Prompt should install the lighthouse-batch utility, a splendid bit of programming by Mike Stead
npm install lighthouse-batch -g
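
Once the install finishes, a quick way to confirm the utility landed on your PATH is to ask it for its help text:

lighthouse-batch --help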

With this first round of prep, you can go get a list of URLs! For example, you might use Screaming Frog Command Line (CLI) automation for some pages.

JSON Files: Step 1 of Running Google Lighthouse Programmatically

The fastest way to run Lighthouse Batch programmatically is to use the -s method, shown below. Essentially the -s option allows you to paste in a comma-separated list of URLs to run.

Whether you’re working in Screaming Frog, or getting Google Search Console export data, you’ll want to use Convert.Town to transform column data into comma-separated values. (Hat tip again to Upbuild for this find.)

When you’re ready, open up command line, and let’s get Google Lighthouse running programmatically!

Here’s what your script should look like:

lighthouse-batch -s https://www.domain.com/page-1/,https://www.domain.com/page-2/,https://www.domain.com/page-3/,https://www.domain.com/page-4/,https://www.domain.com/page-5/

Hit enter, and watch it go! (Keep the URL list as one long argument, with no spaces after the commas.) You may notice that the command line takes a few seconds to spool everything up before it starts running. After a few weeks of field-testing, I’ve been able to run up to 110 URLs consistently at a time without tripping an error.

You’ll see lots of wonderful code scrolling through the window, and after it finishes, you should have some JSON files in C:\Users\[YOUR USERNAME]\report\lighthouse

The Tricky Part of Running Google Lighthouse Programmatically

So if I ended this article here, it would be a poor summary of an Upbuild blog post. However! I found that while Google Sheets was indeed a brilliant solve, it didn’t meet my needs.

Here’s what I encountered in Google Sheets:

  • I may have broken some quotas for running scripts 🙂
  • The sheet took a long time to refresh queries
  • Because Google Sheets is less stable for large amounts of interactive data, it was slow to work with
  • Because the queries auto-refreshed, it wasn’t super handy for having a snapshot in time for later manipulation
  • Also, not being able to work with data locally (vs an FTP) didn’t go with my personal workflow very well

Again, huge ups to Upbuild for putting way more effort into this than I ever did! I just found that my needs diverged from the use case. Here’s what I did differently.

Larger Lighthouse Scale? Back to JSON

Here’s my bone to pick with trying to run Lighthouse programmatically: JSON files make no sense to me as an SEO strategist and practitioner. (I get that JSON is great for developers for many reasons, but I’m not them.)

Sadly, Lighthouse only offers JSON or HTML reports, as far as I know. (If you know otherwise, please reach out! Would love to buy you a coffee. 🙂 ) So, that means you have to embrace the JSON.

I need tabular data, specifically, Excel exports to share with marketing and development teams for fixes from the report. So here’s where we are: we have a bunch of JSON files sitting on the hard drive.

This means command line batch script might be a good fit. Is it possible? How do we do it?

Command Line Batch Scripting to the Rescue!

In very simple terms:

  • We have information in File A (JSON) that needs to be extracted, and
  • Transformed into tabular (CSV) form, and
  • Both executed in a programmatic fashion

I unsuccessfully tried a number of manual batch script options, but if we go looking, there’s a solution to get us on our way! Meet JQ, a command line processor for JSON. In particular, JQ can extract data from a JSON file and transform it into a CSV!

If we mesh JQ with some traditional batch scripting, we can achieve the requirements above.

A Little More Prep Work to Get Programmatic

You can download JQ here or at the link above (same URL). There is one detail that needs attention: in order to make JQ work properly in Command Line / Batch Script, you’ll need to copy the address of where you installed JQ and add it to your Path. Here’s a quick video showing you how to do it on Windows.
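
A quick sanity check that the Path change took: open a fresh Command Prompt window and run the line below. If a version number comes back, you’re set.

jq --version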

Once you have JQ successfully downloaded, installed and tested, it’s time to begin some scripting.

Programmatic Magic With Batch Script

Here’s both the fun and challenging part! In order to achieve a successful result, we need to lay out how our script will work in “pseudocode”. (Using pseudocode very loosely here as instructions for how the program should function.) Pseudocode follows.

Pseudocode for Programmatic Lighthouse Batch Script

For each JSON file (Sample JSON files attached) in directory C:\Users\yourusername\report\lighthouse, perform the following:

  • Extract the keys and values from this sub-section of each JSON file: audits > uses-rel-preload > details > items (We’re looking for url and wastedMs, and the values underneath)
  • Convert the extracted JSON data to a CSV, and place it in a newly created folder named “uses-rel-preload” (without quotation marks) within C:\Users\yourusername\report\lighthouse. The CSV file should be named the same as the JSON file name (without the .json file extension)
  • Within the newly created CSV, insert two new columns to the left of “url” and “wastedMs”. Title the new column A “auditItem” and the new column B “sourceURL”.
  • In column A, populate “uses-rel-preload” (without quotation marks) into each row where there are values in Columns C & D.
  • In column B, populate the CSV file name for each row where there are values in Columns C & D.
  • If there are no values in the JSON file from Step 1, create a new CSV with the JSON file name (no JSON file extension) in the new folder under C:\Users\yourusername\report\lighthouse (from Step 2), with first column header “url”, second column header “wastedMs” and a single row below, with both cells containing “N/A”.
  • Within the newly created CSV, insert two new columns to the left of “url” and “wastedMs”. Title the new column A “auditItem” and the new column B “sourceURL”.
  • In column A, populate “uses-rel-preload” (without quotation marks) into the first blank row.
  • In column B, populate the CSV file name into the first blank row.
  • Repeat/loop this action for every JSON file in the starting directory, but do not create a new folder in Step 2 for each following run; continue to use the same “uses-rel-preload” folder.
  • After the loop has completed for all files in the starting directory, change directories to the new uses-rel-preload folder from Step 2. (C:\Users\yourusername\report\lighthouse\uses-rel-preload)
  • In that directory, combine all the CSV files (except for the header row in 2nd and subsequent CSV files, do not include the header row after the first file) into a new file titled uses-rel-preload.csv.

Batch Script for Programmatic Lighthouse Data Extraction

Alright, fire up your favorite text or code editor, and customize this to your needs! Personally, I call the file via Command Prompt. Also, special thanks to Albin for his help with the JQ and CSV transformation.

:: Housekeeping and settings
@echo off
TITLE Lighthouse JSON to CSV
:: Enable replacing variables with values in the for loop
setlocal enabledelayedexpansion
:: Clear the screen
cls
:: Specify the working directory where the loop needs to perform
cd "C:\Users\yourusername\report\lighthouse"
:: For each file (the working variable %%A) that dir /b /a-d finds in the working directory above, i.e. every .json file
for /f "tokens=*" %%A in ('dir /b /a-d *.json ') do ( 

	echo %%A
	set inputfile=%%A
	echo !inputfile!
	set inputfilename=%%~nA
	del !inputfilename!.csv_temp
	del !inputfilename!.csv

:: Parse the JSON with jq, extracting the specific details from the audit items and recommendations

:: Call JQ and a filter file that transforms the data into a temporary CSV file 
	type !inputfile! | jq .audits.\"uses-rel-preload\".details.items | jq -r -f filter.jq>>!inputfilename!.csv_temp

:: Create the first row of the CSV
	echo auditItem,sourceURL,url,wastedMs>!inputfilename!.csv

:: Check for empty files and provide fallback values if there are no recommendations
	for /f %%x in ("!inputfilename!.csv_temp") do set size=%%~zx
	if not !size! gtr 0 echo uses-rel-preload,!inputfile!,N/A,N/A>>!inputfilename!.csv

:: For all lines, make csv and add additional column
	for /F "skip=1 tokens=1,2 delims=," %%i in (!inputfilename!.csv_temp) do ( 			
			set url=%%i
			set wastedms=%%j
			set url=!url:"=!
			set wastedms=!wastedms:"=!
			echo uses-rel-preload,!inputfile!,!url!,!wastedms!>>!inputfilename!.csv
		)

:: Clear the temporary CSV files out of the working directory
	del !inputfilename!.csv_temp

	)

:: Make a new folder to house the files
mkdir uses-rel-preload


:: Move all CSV files into the audit type folder
move *.csv "uses-rel-preload"

:: Change the working directory
cd "C:\Users\yourusername\report\lighthouse\uses-rel-preload"

:: Start merging the individual files into a single CSV for the Lighthouse audit
:: Set a counter to 1
set cnt=1

:: For each file that matches *.csv, do the following loop
for %%i in (*.csv) do (
:: If the counter is 1, it's the first file through the loop
  if !cnt!==1 (
:: Push the entire file complete with header into uses-rel-preload.csv - this will also create uses-rel-preload.csv
    for /f "delims=" %%j in ('type "%%i"') do echo %%j >> uses-rel-preload.csv
:: Otherwise, make sure we're not working with the uses-rel-preload.csv file and
  ) else if %%i NEQ uses-rel-preload.csv (
:: push the file without the header into uses-rel-preload.csv
    for /f "skip=1 delims=" %%j in ('type "%%i"') do echo %%j >> uses-rel-preload.csv
  )
REM increment count by 1
  set /a cnt+=1
)

Additionally, here’s the filter file (filter.jq) referenced in the code.

def tocsv:
    (map(keys)
        |add
        |unique
        |sort
    ) as $cols
    |map(. as $row
        |$cols
        |map($row[.]|tostring)
    ) as $rows
    |$cols,$rows[]
    | @csv;

tocsv
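
Before wiring this into the loop, it’s worth testing the filter against a single report straight from the Command Prompt (swap in one of your actual report file names):

type your-report.json | jq .audits.\"uses-rel-preload\".details.items | jq -r -f filter.jq

If the audit has findings, you should see url and wastedMs rows in CSV form; if it doesn’t, you’ll likely get no output (or a jq complaint), which is exactly the empty-file case the batch script’s fallback handles.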

Ok! There’s a lot going on. Let’s make sure a few important details aren’t missed:

  • This assumes that you left the JSON files in the default export location, C:\Users\yourusername\report\lighthouse
  • This also assumes that you know how to run a batch script (see the quick example just after this list).
  • It’s not necessary per se, but for simplicity, I highly recommend you run and save the batch script files and resources all in the same directory as cited above
    • You’ll just have to be thoughtful about file path references if you spread resources/calls out.
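
For the avoidance of doubt, “running a batch script” here just means saving the code above into a .bat file (the name below is only an example) inside the Lighthouse report directory and calling it from Command Prompt:

cd "C:\Users\yourusername\report\lighthouse"
extract-preload.bat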

The Output

Here’s what you should have:

  • The JSON files in the original directory
  • A new folder with the CSV files that contain URLs referenced in the Lighthouse uses-rel-preload report
    • The rows will contain N/A if the audit was empty, or had no results
  • Also within the folder should be a report-level CSV: all the individual CSVs combined into one sitewide file containing the preload recommendations for every page you ran. Hooray!

Limitations on this Programmatic Lighthouse Approach

The astute will notice that this approach only covers one audit! Therein lies the downside to this approach: you’ll need to customize and program out additional code/files for the other audits.

Assuming that you program out all the audits, you’ll want to create an audit summary folder within the first batch script file. This can be done with the mkdir command in the script.

Additionally, for each audit, you’ll want to copy the single overarching audit file into the audit summary folder. Using a similar script to the last half of the one above, you can combine all the audits into a single CSV that can easily be manipulated or ingested.
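
As a rough sketch of how those last two steps could look (the folder and file names here are placeholders, and the merge assumes it runs inside a script with delayed expansion enabled, like the one above):

:: Create a shared summary folder once (e.g., in the first audit's batch file)
mkdir "C:\Users\yourusername\report\lighthouse\audit-summary"

:: At the end of each audit's script, copy its combined CSV into the summary folder
copy uses-rel-preload.csv "C:\Users\yourusername\report\lighthouse\audit-summary"

:: Finally, merge every audit-level CSV into one sitewide file, keeping only the first header row
cd "C:\Users\yourusername\report\lighthouse\audit-summary"
set cnt=1
for %%i in (*.csv) do (
  if !cnt!==1 (
    for /f "delims=" %%j in ('type "%%i"') do echo %%j >> all-audits.csv
  ) else if %%i NEQ all-audits.csv (
    for /f "skip=1 delims=" %%j in ('type "%%i"') do echo %%j >> all-audits.csv
  )
  set /a cnt+=1
)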