linted and fixed

Signed-off-by: CocoByte <nicolle.leclair@gmail.com>
CocoByte 2023-10-12 12:54:56 -06:00
parent 9ab908119d
commit 81cd4e900e
GPG key ID: BBFAA2526384C97F
2 changed files with 66 additions and 65 deletions


@ -1,11 +1,12 @@
# Registrar Data Migration # Registrar Data Migration
There is an existing registrar/registry at Verisign. They will provide us with an The original system has an existing registrar/registry that we will import.
export of the data from that system. The goal of our data migration is to take The company of that system will provide us with an export of the data.
the provided data and use it to create as much as possible a _matching_ state The goal of our data migration is to take the provided data and use
it to create as much as possible a _matching_ state
in our registrar. in our registrar.
There is no way to make our registrar _identical_ to the Verisign system There is no way to make our registrar _identical_ to the original system
because we have a different data model and workflow model. Instead, we should because we have a different data model and workflow model. Instead, we should
focus our migration efforts on creating a state in our new registrar that will focus our migration efforts on creating a state in our new registrar that will
primarily allow users of the system to perform the tasks that they want to do. primarily allow users of the system to perform the tasks that they want to do.
@ -18,7 +19,7 @@ Login.gov account can make an account on the new registrar, and the first time
that person logs in through Login.gov, we make a corresponding account in our that person logs in through Login.gov, we make a corresponding account in our
user table. Because we cannot know the Universal Unique ID (UUID) for a user table. Because we cannot know the Universal Unique ID (UUID) for a
person's Login.gov account, we cannot pre-create user accounts for individuals person's Login.gov account, we cannot pre-create user accounts for individuals
in our new registrar based on the data from Verisign. in our new registrar based on the original data.
## Domains ## Domains
@ -27,7 +28,7 @@ information is the registry, but the registrar needs a copy of that
information to make connections between registry users and the domains that information to make connections between registry users and the domains that
they manage. The registrar stores very few fields about a domain except for they manage. The registrar stores very few fields about a domain except for
its name, so it could be straightforward to import the exported list of domains its name, so it could be straightforward to import the exported list of domains
from Verisign's `escrow_domains.daily.dotgov.GOV.txt`. It doesn't appear that from `escrow_domains.daily.dotgov.GOV.txt`. It doesn't appear that
that table stores a flag for active or inactive. that table stores a flag for active or inactive.
An example Django management command that can load the delimited text file An example Django management command that can load the delimited text file
@ -43,7 +44,7 @@ docker compose run -T app ./manage.py load_domains_data < /tmp/escrow_domains.da
## User access to domains ## User access to domains
The Verisign data contains an `escrow_domain_contacts.daily.dotgov.txt` file The data export contains an `escrow_domain_contacts.daily.dotgov.txt` file
that links each domain to three different types of contacts: `billing`, that links each domain to three different types of contacts: `billing`,
`tech`, and `admin`. The ID of the contact in this linking table corresponds `tech`, and `admin`. The ID of the contact in this linking table corresponds
to the ID of a contact in the `escrow_contacts.daily.dotgov.txt` file. In the to the ID of a contact in the `escrow_contacts.daily.dotgov.txt` file. In the
@ -59,9 +60,9 @@ invitation's domain.
For the purposes of migration, we can prime the invitation system by creating For the purposes of migration, we can prime the invitation system by creating
an invitation in the system for each email address listed in the an invitation in the system for each email address listed in the
`domain_contacts` file. This means that if a person is currently a user in the `domain_contacts` file. This means that if a person is currently a user in the
Verisign system, and they use the same email address with Login.gov, then they original system, and they use the same email address with Login.gov, then they
will end up with access to the same domains in the new registrar that they will end up with access to the same domains in the new registrar that they
were associated with in the Verisign system. were associated with in the original system.
A management command that does this needs to process two data files, one for A management command that does this needs to process two data files, one for
the contact information and one for the domain/contact association, so we the contact information and one for the domain/contact association, so we
@ -78,17 +79,18 @@ docker compose run app ./manage.py load_domain_invitations /app/escrow_domain_co
``` ```
## Transition Domains ## Transition Domains
Verisign provides information about Transition Domains in 3 files; We are provided with information about Transition Domains in 3 files:
FILE 1: escrow_domain_contacts.daily.gov.GOV.txt FILE 1: **escrow_domain_contacts.daily.gov.GOV.txt** -> has the map of domain names to contact ID. Domains in this file will usually have 3 contacts each
FILE 2: escrow_contacts.daily.gov.GOV.txt FILE 2: **escrow_contacts.daily.gov.GOV.txt** -> has the mapping of contact id to contact email address (which is what we care about for sending domain invitations)
FILE 3: escrow_domain_statuses.daily.gov.GOV.txt FILE 3: **escrow_domain_statuses.daily.gov.GOV.txt** -> has the map of domains and their statuses
Transferring this data from these files into our domain tables happens in two steps: Transferring this data from these files into our domain tables happens in two steps:
***IMPORTANT: only run the following locally, to avoid publicizing PII in our public repo.***
### STEP 1: Load Transition Domain data into TransitionDomain table ### STEP 1: Load Transition Domain data into TransitionDomain table
**SETUP** **SETUP**
In order to use the management command, we need to add the files to a folder under `src/`. In order to use the management command, we need to add the files to a folder under `src/`.
This will allow Docker to mount the files to a container (under `/app`) for our use. This will allow Docker to mount the files to a container (under `/app`) for our use.
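The relationship between the three files can be sketched in plain Python. This is only an illustration, not the management command itself: the `|` delimiter, the two-column layout of each file, and the names `read_mapping` and `build_rows` are assumptions made for the example, and a missing status is left as `None` rather than mapped to a default.

```python
import csv
import io


def read_mapping(text, key_col, value_col, sep="|"):
    """Return {row[key_col]: row[value_col]} for each delimited row (first occurrence wins)."""
    mapping = {}
    for row in csv.reader(io.StringIO(text), delimiter=sep):
        if len(row) <= max(key_col, value_col):
            continue  # skip blank or short rows
        mapping.setdefault(row[key_col].strip(), row[value_col].strip())
    return mapping


def build_rows(domain_contacts_text, contacts_text, statuses_text, sep="|"):
    """Join FILE 1 (domain -> contact id) with FILE 2 (contact id -> email)
    and FILE 3 (domain -> status).

    Column positions are hypothetical: column 0 is the key and column 1 is
    the value in every file.
    """
    emails = read_mapping(contacts_text, 0, 1, sep)
    statuses = read_mapping(statuses_text, 0, 1, sep)
    rows = []
    contacts_without_email = []
    for row in csv.reader(io.StringIO(domain_contacts_text), delimiter=sep):
        if len(row) < 2:
            continue
        domain, contact_id = row[0].strip(), row[1].strip()
        email = emails.get(contact_id)
        if email is None:
            # Track contacts that cannot receive an invitation.
            if contact_id not in contacts_without_email:
                contacts_without_email.append(contact_id)
            continue
        rows.append((domain, email, statuses.get(domain)))
    return rows, contacts_without_email
```

Each returned tuple carries the domain name, the contact e-mail, and the status, which is the information a row in the TransitionDomain table needs.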


@ -1,5 +1,3 @@
"""Load domain invitations for existing domains and their contacts."""
import sys import sys
import csv import csv
import logging import logging
@ -193,10 +191,11 @@ class Command(BaseCommand):
total_duplicate_domains = len(duplicate_domains) total_duplicate_domains = len(duplicate_domains)
total_users_without_email = len(users_without_email) total_users_without_email = len(users_without_email)
if total_users_without_email > 0: if total_users_without_email > 0:
users_without_email_as_string = "{}".format(
", ".join(map(str, users_without_email))
)
logger.warning( logger.warning(
"No e-mails found for users: {}".format( f"{termColors.YELLOW} No e-mails found for users: {users_without_email_as_string}" # noqa
", ".join(map(str, users_without_email))
)
) )
if total_duplicate_pairs > 0 or total_duplicate_domains > 0: if total_duplicate_pairs > 0 or total_duplicate_domains > 0:
duplicate_pairs_as_string = "{}".format( duplicate_pairs_as_string = "{}".format(
@ -267,7 +266,6 @@ class Command(BaseCommand):
{domains_without_status_as_string} {domains_without_status_as_string}
{termColors.ENDC}""" {termColors.ENDC}"""
) )
def print_debug(self, print_condition: bool, print_statement: str): def print_debug(self, print_condition: bool, print_statement: str):
"""This function reduces complexity of debug statements """This function reduces complexity of debug statements
@ -278,8 +276,7 @@ class Command(BaseCommand):
if print_condition: if print_condition:
logger.info(print_statement) logger.info(print_statement)
def prompt_table_reset(self):
def prompt_table_reset():
"""Brings up a prompt in the terminal asking """Brings up a prompt in the terminal asking
if the user wishes to delete data in the if the user wishes to delete data in the
TransitionDomain table. If the user confirms, TransitionDomain table. If the user confirms,
@ -300,7 +297,7 @@ class Command(BaseCommand):
) )
TransitionDomain.objects.all().delete() TransitionDomain.objects.all().delete()
def handle( def handle( # noqa: C901
self, self,
domain_contacts_filename, domain_contacts_filename,
contacts_filename, contacts_filename,
@ -382,24 +379,24 @@ class Command(BaseCommand):
# PART 1: Get the status # PART 1: Get the status
if new_entry_domain_name not in domain_status_dictionary: if new_entry_domain_name not in domain_status_dictionary:
# This domain has no status...default to "Create" # This domain has no status...default to "Create"
# (For data analysis purposes, add domain name # (For data analysis purposes, add domain name
# to list of all domains without status # to list of all domains without status
# (avoid duplicate entries)) # (avoid duplicate entries))
if new_entry_domain_name not in domains_without_status: if new_entry_domain_name not in domains_without_status:
domains_without_status.append(new_entry_domain_name) domains_without_status.append(new_entry_domain_name)
else: else:
# Map the status # Map the status
original_status = domain_status_dictionary[new_entry_domain_name] original_status = domain_status_dictionary[new_entry_domain_name]
mapped_status = self.get_mapped_status(original_status) mapped_status = self.get_mapped_status(original_status)
if mapped_status is None: if mapped_status is None:
# (For data analysis purposes, check for any statuses # (For data analysis purposes, check for any statuses
# that don't have a mapping and add to list # that don't have a mapping and add to list
# of "outlier statuses") # of "outlier statuses")
logger.info("Unknown status: " + original_status) logger.info("Unknown status: " + original_status)
outlier_statuses.append(original_status) outlier_statuses.append(original_status)
else: else:
new_entry_status = mapped_status new_entry_status = mapped_status
# PART 2: Get the e-mail # PART 2: Get the e-mail
if user_id not in user_emails_dictionary: if user_id not in user_emails_dictionary:
# this user has no e-mail...this should never happen # this user has no e-mail...this should never happen
@ -433,27 +430,28 @@ class Command(BaseCommand):
# DEBUG: # DEBUG:
self.print_debug( self.print_debug(
debug_on, debug_on,
f"{termColors.YELLOW} DUPLICATE Verisign entries found for domain: {new_entry_domain_name} {termColors.ENDC}" # noqa f"{termColors.YELLOW} DUPLICATE file entries found for domain: {new_entry_domain_name} {termColors.ENDC}", # noqa
) )
if new_entry_domain_name not in duplicate_domains: if new_entry_domain_name not in duplicate_domains:
duplicate_domains.append(new_entry_domain_name) duplicate_domains.append(new_entry_domain_name)
if existing_domain_user_pair is not None: if existing_domain_user_pair is not None:
# DEBUG: # DEBUG:
self.print_debug( self.print_debug(
debug_on, debug_on,
f"""{termColors.YELLOW} DUPLICATE Verisign entries found for domain - user {termColors.BackgroundLightYellow} PAIR {termColors.ENDC}{termColors.YELLOW}: f"""{termColors.YELLOW} DUPLICATE file entries found for domain - user {termColors.BackgroundLightYellow} PAIR {termColors.ENDC}{termColors.YELLOW}:
{new_entry_domain_name} - {new_entry_email} {termColors.ENDC}""" # noqa {new_entry_domain_name} - {new_entry_email} {termColors.ENDC}""", # noqa
) )
if existing_domain_user_pair not in duplicate_domain_user_combos: if existing_domain_user_pair not in duplicate_domain_user_combos:
duplicate_domain_user_combos.append(existing_domain_user_pair) duplicate_domain_user_combos.append(existing_domain_user_pair)
else: else:
try: entry_exists = TransitionDomain.objects.filter(
entry_exists = TransitionDomain.objects.exists( username=new_entry_email, domain_name=new_entry_domain_name
username=new_entry_email, domain_name=new_entry_domain_name ).exists()
) if entry_exists:
if(entry_exists): try:
existing_entry = TransitionDomain.objects.get( existing_entry = TransitionDomain.objects.get(
username=new_entry_email, domain_name=new_entry_domain_name username=new_entry_email,
domain_name=new_entry_domain_name,
) )
if existing_entry.status != new_entry_status: if existing_entry.status != new_entry_status:
@ -464,13 +462,21 @@ class Command(BaseCommand):
f"Updating entry: {existing_entry}" f"Updating entry: {existing_entry}"
f"Status: {existing_entry.status} > {new_entry_status}" # noqa f"Status: {existing_entry.status} > {new_entry_status}" # noqa
f"Email Sent: {existing_entry.email_sent} > {new_entry_emailSent}" # noqa f"Email Sent: {existing_entry.email_sent} > {new_entry_emailSent}" # noqa
f"{termColors.ENDC}" f"{termColors.ENDC}",
) )
existing_entry.status = new_entry_status existing_entry.status = new_entry_status
existing_entry.email_sent = new_entry_emailSent
existing_entry.email_sent = new_entry_emailSent existing_entry.save()
existing_entry.save() except TransitionDomain.MultipleObjectsReturned:
except TransitionDomain.DoesNotExist: logger.info(
f"{termColors.FAIL}"
f"!!! ERROR: duplicate entries exist in the "
f"transition_domain table for domain: "
f"{new_entry_domain_name} "
f"----------TERMINATING----------"
)
sys.exit()
else:
# no matching entry, make one # no matching entry, make one
new_entry = TransitionDomain( new_entry = TransitionDomain(
username=new_entry_email, username=new_entry_email,
@ -484,27 +490,20 @@ class Command(BaseCommand):
# DEBUG: # DEBUG:
self.print_debug( self.print_debug(
debug_on, debug_on,
f"{termColors.OKCYAN} Adding entry {total_new_entries}: {new_entry} {termColors.ENDC}" # noqa f"{termColors.OKCYAN} Adding entry {total_new_entries}: {new_entry} {termColors.ENDC}", # noqa
) )
except TransitionDomain.MultipleObjectsReturned:
logger.info(
f"{termColors.FAIL}"
f"!!! ERROR: duplicate entries exist in the"
f"transtion_domain table for domain:"
f"{new_entry_domain_name}"
f"----------TERMINATING----------"
)
sys.exit()
# DEBUG: # Check Parse limit and exit loop if needed
if (total_rows_parsed >= debug_max_entries_to_parse if (
and debug_max_entries_to_parse != 0): total_rows_parsed >= debug_max_entries_to_parse
logger.info( and debug_max_entries_to_parse != 0
f"{termColors.YELLOW}" ):
f"----PARSE LIMIT REACHED. HALTING PARSER.----" logger.info(
f"{termColors.ENDC}" f"{termColors.YELLOW}"
) f"----PARSE LIMIT REACHED. HALTING PARSER.----"
break f"{termColors.ENDC}"
)
break
TransitionDomain.objects.bulk_create(to_create) TransitionDomain.objects.bulk_create(to_create)
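The check-then-update-or-create flow in the hunks above can be sketched without Django. In this hedged illustration a dictionary keyed by `(username, domain_name)` stands in for the TransitionDomain table; the `Entry` dataclass and the `upsert` helper are hypothetical names that do not appear in the command. As in the diff, an existing entry is only touched when its status differs.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a TransitionDomain row."""

    username: str
    domain_name: str
    status: str
    email_sent: bool = False


def upsert(table, username, domain_name, status, email_sent=False):
    """Create or update the entry keyed by (username, domain_name).

    Returns "created", "updated", or "unchanged" so callers can keep counts.
    """
    key = (username, domain_name)
    existing = table.get(key)
    if existing is None:
        # No matching entry, make one.
        table[key] = Entry(username, domain_name, status, email_sent)
        return "created"
    if existing.status != status:
        # Mirror the diff: update status and email_sent together.
        existing.status = status
        existing.email_sent = email_sent
        return "updated"
    return "unchanged"
```

Keying the dictionary on the pair makes duplicates impossible by construction, whereas the command has to handle `MultipleObjectsReturned` because the table itself may already hold duplicate rows.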