Generate detail reports from BigQuery via Beam

This establishes a fully functional pipeline that generates a detail report for each registrar_tld pair from BigQuery. The main features are:

1. Deserialization from Avro GenericRecord (from BigQuery) into BillingEvent, a POJO we control. This is especially valuable because it gives us type safety from the start of the pipeline.
2. Addition of .sql files containing the queries used to generate detail reports. These will later be templated to enable general usage.
3. Multi-file writing within a single TextIO transform, which writes BillingEvents to different files based on their registrar_tld key.
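
As a rough illustration of features 1 and 3, the sketch below converts loosely typed records into a typed event and then buckets the resulting lines by registrar_tld key. It is not the code in this commit: a plain Map stands in for the Avro GenericRecord, a LinkedHashMap stands in for the per-key output files that the TextIO transform would produce, and the field names (registrarId, tld, action, amount) and the class name DetailReportSketch are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: a Map stands in for the Avro GenericRecord read from BigQuery. */
class DetailReportSketch {

  /** Simplified stand-in for the BillingEvent POJO; this field set is hypothetical. */
  static final class BillingEvent {
    final String registrarId;
    final String tld;
    final String action;
    final double amount;

    private BillingEvent(String registrarId, String tld, String action, double amount) {
      this.registrarId = registrarId;
      this.tld = tld;
      this.action = action;
      this.amount = amount;
    }

    /** Converts one loosely typed record into a typed event, failing fast on bad input. */
    static BillingEvent fromRecord(Map<String, Object> record) {
      return new BillingEvent(
          requireString(record, "registrarId"),
          requireString(record, "tld"),
          requireString(record, "action"),
          Double.parseDouble(requireString(record, "amount")));
    }

    private static String requireString(Map<String, Object> record, String field) {
      Object value = record.get(field);
      if (value == null) {
        throw new IllegalStateException("Missing field: " + field);
      }
      return String.valueOf(value);
    }

    /** The registrar_tld key that decides which report file the event belongs to. */
    String toFilename() {
      return registrarId + "_" + tld;
    }

    /** One CSV line of the detail report. */
    String toCsv() {
      return String.join(",", registrarId, tld, action, String.valueOf(amount));
    }
  }

  /** Groups CSV lines by registrar_tld key, mimicking one output file per key. */
  static Map<String, List<String>> groupByReport(List<Map<String, Object>> records) {
    Map<String, List<String>> reports = new LinkedHashMap<>();
    for (Map<String, Object> record : records) {
      BillingEvent event = BillingEvent.fromRecord(record);
      reports.computeIfAbsent(event.toFilename(), k -> new ArrayList<>()).add(event.toCsv());
    }
    return reports;
  }

  public static void main(String[] args) {
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("registrarId", "registrarA");
    record.put("tld", "test");
    record.put("action", "RENEW");
    record.put("amount", "20.5");
    // Print each report key followed by the lines that would be written to it.
    groupByReport(List.of(record)).forEach((name, lines) -> System.out.println(name + ": " + lines));
  }
}
```

Converting to a typed object in the first step means that a missing or malformed field surfaces immediately, before any grouping or writing happens.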

This also upgrades the Beam core SDK referenced in repositories.bzl to 2.2.0 and returns the definitions to alphabetical order, to facilitate use of the check_bazel_deps.py script.

The final steps are:
- Converting this to a Nomulus command
- Templating the .sql queries
- @Injecting the @Config values for a given project

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=178124838
larryruili 2017-12-06 11:19:18 -08:00 committed by jianglai
parent d736f7f08d
commit 735112def6
3 changed files with 395 additions and 399 deletions

@@ -67,13 +67,13 @@ public class GenerateInvoicesActionTest {
     action.run();
     LaunchTemplateParameters expectedParams =
         new LaunchTemplateParameters()
-            .setJobName("test-bigquerytemplate1")
+            .setJobName("test-invoicing")
             .setEnvironment(
                 new RuntimeEnvironment()
                     .setZone("us-east1-c")
                     .setTempLocation("gs://test-project-beam/temp"));
     verify(templates).launch("test-project", expectedParams);
-    verify(launch).setGcsPath("gs://test-project-beam/templates/bigquery1");
+    verify(launch).setGcsPath("gs://test-project-beam/templates/invoicing");
     assertThat(response.getStatus()).isEqualTo(200);
     assertThat(response.getPayload()).isEqualTo("Launched dataflow template.");
   }