Optimize activity report SQL

We only log FLOW-LOG-SIGNATURE-METADATA from one place: FlowReporter. As a
result, we can swap the generalized LIKE filter for a prefix-only STARTS_WITH
filter, saving a lot of processing for our EPP query (which is the most
expensive of the bunch).
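
As a rough illustration of the two filters, here is a minimal BigQuery
standard SQL sketch. The sample_logs rows and the some.other.Logger name are
made up for illustration; only the marker and prefix strings are taken from
this change.

```sql
-- Hypothetical sample rows; only the marker and prefix strings come from this change.
WITH sample_logs AS (
  SELECT logMessage
  FROM UNNEST([
    'google.registry.flows.FlowReporter recordToLogs: FLOW-LOG-SIGNATURE-METADATA: {"tlds":["example"]}',
    'some.other.Logger log: unrelated message',
    'some.other.Logger log: quoted FLOW-LOG-SIGNATURE-METADATA text'
  ]) AS logMessage
)
SELECT
  logMessage,
  -- Old filter: matches the marker anywhere in the message.
  logMessage LIKE '%FLOW-LOG-SIGNATURE-METADATA%' AS substring_match,
  -- New filter: matches only messages that begin with the logger prefix.
  STARTS_WITH(
    logMessage,
    'google.registry.flows.FlowReporter recordToLogs: FLOW-LOG-SIGNATURE-METADATA'
  ) AS prefix_match
FROM sample_logs;
```

The two columns differ only for the third (made-up) row, where the marker
appears mid-message. Since the marker is logged from a single place, rows like
that should not occur in the real logs, so the prefix filter is expected to
select the same rows.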

I've also changed the test dates from 2017-05 to 2017-06, allowing us to copy-paste
the test data into BigQuery to verify that the queries work. We use 2017-06 in particular because June was the first month that populated all the metadata necessary to generate these reports.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=165391715
larryruili 2017-08-15 19:11:05 -07:00 committed by Ben McIlwain
parent 9e7c996081
commit 38abe9fa48
10 changed files with 33 additions and 29 deletions

@@ -39,11 +39,11 @@ FROM (
 -- Extract the logged JSON payload.
 REGEXP_EXTRACT(logMessage, r'FLOW-LOG-SIGNATURE-METADATA: (.*)\n?$')
 AS json
-FROM `domain-registry-alpha.icann_reporting.monthly_logs_201705` AS logs
+FROM `domain-registry-alpha.icann_reporting.monthly_logs_201706` AS logs
 JOIN
 UNNEST(logs.logMessage) AS logMessage
 WHERE
-logMessage LIKE "%FLOW-LOG-SIGNATURE-METADATA%")) AS regexes
+STARTS_WITH(logMessage, "google.registry.flows.FlowReporter recordToLogs: FLOW-LOG-SIGNATURE-METADATA"))) AS regexes
 JOIN
 -- Unnest the JSON-parsed tlds.
 UNNEST(regexes.tlds) AS tld