linted and fixed

Signed-off-by: CocoByte <nicolle.leclair@gmail.com>
CocoByte 2023-10-12 12:54:56 -06:00
parent 9ab908119d
commit 81cd4e900e
GPG key ID: BBFAA2526384C97F
2 changed files with 66 additions and 65 deletions

View file

@ -1,11 +1,12 @@
# Registrar Data Migration
There is an existing registrar/registry at Verisign. They will provide us with an
export of the data from that system. The goal of our data migration is to take
the provided data and use it to create as much as possible a _matching_ state
The original system has an existing registrar/registry that we will import.
The company of that system will provide us with an export of the data.
The goal of our data migration is to take the provided data and use
it to create, as far as possible, a _matching_ state
in our registrar.
There is no way to make our registrar _identical_ to the Verisign system
There is no way to make our registrar _identical_ to the original system
because we have a different data model and workflow model. Instead, we should
focus our migration efforts on creating a state in our new registrar that will
primarily allow users of the system to perform the tasks that they want to do.
@ -18,7 +19,7 @@ Login.gov account can make an account on the new registrar, and the first time
that person logs in through Login.gov, we make a corresponding account in our
user table. Because we cannot know the Universal Unique ID (UUID) for a
person's Login.gov account, we cannot pre-create user accounts for individuals
in our new registrar based on the data from Verisign.
in our new registrar based on the original data.
## Domains
@ -27,7 +28,7 @@ information is the registry, but the registrar needs a copy of that
information to make connections between registry users and the domains that
they manage. The registrar stores very few fields about a domain except for
its name, so it could be straightforward to import the exported list of domains
from Verisign's `escrow_domains.daily.dotgov.GOV.txt`. It doesn't appear that
from `escrow_domains.daily.dotgov.GOV.txt`. It doesn't appear that
that table stores a flag for active or inactive.
An example Django management command that can load the delimited text file
@ -43,7 +44,7 @@ docker compose run -T app ./manage.py load_domains_data < /tmp/escrow_domains.da
## User access to domains
The Verisign data contains a `escrow_domain_contacts.daily.dotgov.txt` file
The data export contains an `escrow_domain_contacts.daily.dotgov.txt` file
that links each domain to three different types of contacts: `billing`,
`tech`, and `admin`. The ID of the contact in this linking table corresponds
to the ID of a contact in the `escrow_contacts.daily.dotgov.txt` file. In the
@ -59,9 +60,9 @@ invitation's domain.
For the purposes of migration, we can prime the invitation system by creating
an invitation in the system for each email address listed in the
`domain_contacts` file. This means that if a person is currently a user in the
Verisign system, and they use the same email address with Login.gov, then they
original system, and they use the same email address with Login.gov, then they
will end up with access to the same domains in the new registrar that they
were associated with in the Verisign system.
were associated with in the original system.
A management command that does this needs to process two data files, one for
the contact information and one for the domain/contact association, so we
@ -78,17 +79,18 @@ docker compose run app ./manage.py load_domain_invitations /app/escrow_domain_co
```
## Transition Domains
Verisign provides information about Transition Domains in 3 files;
FILE 1: escrow_domain_contacts.daily.gov.GOV.txt
FILE 2: escrow_contacts.daily.gov.GOV.txt
FILE 3: escrow_domain_statuses.daily.gov.GOV.txt
We are provided with information about Transition Domains in 3 files:
FILE 1: **escrow_domain_contacts.daily.gov.GOV.txt** -> has the map of domain names to contact ID. Domains in this file will usually have 3 contacts each
FILE 2: **escrow_contacts.daily.gov.GOV.txt** -> has the mapping of contact id to contact email address (which is what we care about for sending domain invitations)
FILE 3: **escrow_domain_statuses.daily.gov.GOV.txt** -> has the map of domains and their statuses
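To illustrate how the three files relate, here is a minimal sketch that joins them into (domain, email, status) entries. It is not the project's management command. The `|` separator, the column order (key in the first column, value in the second), the case-insensitive domain matching, and the helper names `read_rows` and `join_transition_data` are all assumptions made for this example.

```python
import csv


def read_rows(handle, sep="|"):
    """Yield stripped, non-empty rows from a delimited text stream."""
    for row in csv.reader(handle, delimiter=sep):
        if row:
            yield [field.strip() for field in row]


def join_transition_data(domain_contacts, contacts, statuses, sep="|"):
    """Return unique (domain, email, status) tuples.

    Assumed (hypothetical) column layout:
      domain_contacts: domain name, contact id
      contacts:        contact id, email address
      statuses:        domain name, status
    """
    # FILE 2: contact id -> email address
    emails = {row[0]: row[1] for row in read_rows(contacts, sep)}
    # FILE 3: domain name -> status (domain names compared case-insensitively)
    status_by_domain = {
        row[0].lower(): row[1] for row in read_rows(statuses, sep)
    }

    entries = set()
    # FILE 1: domain name -> contact id (usually three contacts per domain)
    for row in read_rows(domain_contacts, sep):
        domain, contact_id = row[0].lower(), row[1]
        email = emails.get(contact_id)
        if not email:
            # A contact without an email cannot receive an invitation.
            continue
        # Status is None when the domain is missing from the status file.
        entries.add((domain, email, status_by_domain.get(domain)))

    # Sort on (domain, email) only, so a None status is never compared.
    return sorted(entries, key=lambda entry: entry[:2])
```

Each argument can be any open text file or in-memory stream. Collecting entries in a set drops repeated domain/email pairs, which matters because a domain can list the same contact more than once.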
Transferring the data from these files into our domain tables happens in two steps:
***IMPORTANT: only run the following locally, to avoid publicizing PII in our public repo.***
### STEP 1: Load Transition Domain data into TransitionDomain table
**SETUP**
In order to use the management command, we need to add the files to a folder under `src/`.
This will allow Docker to mount the files to a container (under `/app`) for our use.

View file

@ -1,5 +1,3 @@
"""Load domain invitations for existing domains and their contacts."""
import sys
import csv
import logging
@ -193,10 +191,11 @@ class Command(BaseCommand):
total_duplicate_domains = len(duplicate_domains)
total_users_without_email = len(users_without_email)
if total_users_without_email > 0:
users_without_email_as_string = "{}".format(
", ".join(map(str, users_without_email))
)
logger.warning(
"No e-mails found for users: {}".format(
", ".join(map(str, users_without_email))
)
f"{termColors.YELLOW} No e-mails found for users: {users_without_email_as_string}" # noqa
)
if total_duplicate_pairs > 0 or total_duplicate_domains > 0:
duplicate_pairs_as_string = "{}".format(
@ -267,7 +266,6 @@ class Command(BaseCommand):
{domains_without_status_as_string}
{termColors.ENDC}"""
)
def print_debug(self, print_condition: bool, print_statement: str):
"""This function reduces complexity of debug statements
@ -278,8 +276,7 @@ class Command(BaseCommand):
if print_condition:
logger.info(print_statement)
def prompt_table_reset():
def prompt_table_reset(self):
"""Brings up a prompt in the terminal asking
if the user wishes to delete data in the
TransitionDomain table. If the user confirms,
@ -300,7 +297,7 @@ class Command(BaseCommand):
)
TransitionDomain.objects.all().delete()
def handle(
def handle( # noqa: C901
self,
domain_contacts_filename,
contacts_filename,
@ -382,24 +379,24 @@ class Command(BaseCommand):
# PART 1: Get the status
if new_entry_domain_name not in domain_status_dictionary:
# This domain has no status...default to "Create"
# (For data analysis purposes, add domain name
# to list of all domains without status
# (For data analysis purposes, add domain name
# to list of all domains without status
# (avoid duplicate entries))
if new_entry_domain_name not in domains_without_status:
domains_without_status.append(new_entry_domain_name)
else:
# Map the status
# Map the status
original_status = domain_status_dictionary[new_entry_domain_name]
mapped_status = self.get_mapped_status(original_status)
if mapped_status is None:
# (For data analysis purposes, check for any statuses
# that don't have a mapping and add to list
# (For data analysis purposes, check for any statuses
# that don't have a mapping and add to list
# of "outlier statuses")
logger.info("Unknown status: " + original_status)
outlier_statuses.append(original_status)
else:
new_entry_status = mapped_status
# PART 2: Get the e-mail
if user_id not in user_emails_dictionary:
# this user has no e-mail...this should never happen
@ -433,27 +430,28 @@ class Command(BaseCommand):
# DEBUG:
self.print_debug(
debug_on,
f"{termColors.YELLOW} DUPLICATE Verisign entries found for domain: {new_entry_domain_name} {termColors.ENDC}" # noqa
)
f"{termColors.YELLOW} DUPLICATE file entries found for domain: {new_entry_domain_name} {termColors.ENDC}", # noqa
)
if new_entry_domain_name not in duplicate_domains:
duplicate_domains.append(new_entry_domain_name)
if existing_domain_user_pair is not None:
# DEBUG:
self.print_debug(
debug_on,
f"""{termColors.YELLOW} DUPLICATE Verisign entries found for domain - user {termColors.BackgroundLightYellow} PAIR {termColors.ENDC}{termColors.YELLOW}:
{new_entry_domain_name} - {new_entry_email} {termColors.ENDC}""" # noqa
)
f"""{termColors.YELLOW} DUPLICATE file entries found for domain - user {termColors.BackgroundLightYellow} PAIR {termColors.ENDC}{termColors.YELLOW}:
{new_entry_domain_name} - {new_entry_email} {termColors.ENDC}""", # noqa
)
if existing_domain_user_pair not in duplicate_domain_user_combos:
duplicate_domain_user_combos.append(existing_domain_user_pair)
else:
try:
entry_exists = TransitionDomain.objects.exists(
username=new_entry_email, domain_name=new_entry_domain_name
)
if(entry_exists):
entry_exists = TransitionDomain.objects.filter(
username=new_entry_email, domain_name=new_entry_domain_name
).exists()
if entry_exists:
try:
existing_entry = TransitionDomain.objects.get(
username=new_entry_email, domain_name=new_entry_domain_name
username=new_entry_email,
domain_name=new_entry_domain_name,
)
if existing_entry.status != new_entry_status:
@ -464,13 +462,21 @@ class Command(BaseCommand):
f"Updating entry: {existing_entry}"
f"Status: {existing_entry.status} > {new_entry_status}" # noqa
f"Email Sent: {existing_entry.email_sent} > {new_entry_emailSent}" # noqa
f"{termColors.ENDC}"
)
f"{termColors.ENDC}",
)
existing_entry.status = new_entry_status
existing_entry.email_sent = new_entry_emailSent
existing_entry.save()
except TransitionDomain.DoesNotExist:
existing_entry.email_sent = new_entry_emailSent
existing_entry.save()
except TransitionDomain.MultipleObjectsReturned:
logger.info(
f"{termColors.FAIL}"
f"!!! ERROR: duplicate entries exist in the "
f"transition_domain table for domain: "
f"{new_entry_domain_name} "
f"----------TERMINATING----------"
)
sys.exit()
else:
# no matching entry, make one
new_entry = TransitionDomain(
username=new_entry_email,
@ -484,27 +490,20 @@ class Command(BaseCommand):
# DEBUG:
self.print_debug(
debug_on,
f"{termColors.OKCYAN} Adding entry {total_new_entries}: {new_entry} {termColors.ENDC}" # noqa
f"{termColors.OKCYAN} Adding entry {total_new_entries}: {new_entry} {termColors.ENDC}", # noqa
)
except TransitionDomain.MultipleObjectsReturned:
logger.info(
f"{termColors.FAIL}"
f"!!! ERROR: duplicate entries exist in the"
f"transtion_domain table for domain:"
f"{new_entry_domain_name}"
f"----------TERMINATING----------"
)
sys.exit()
# DEBUG:
if (total_rows_parsed >= debug_max_entries_to_parse
and debug_max_entries_to_parse != 0):
logger.info(
f"{termColors.YELLOW}"
f"----PARSE LIMIT REACHED. HALTING PARSER.----"
f"{termColors.ENDC}"
)
break
# Check Parse limit and exit loop if needed
if (
total_rows_parsed >= debug_max_entries_to_parse
and debug_max_entries_to_parse != 0
):
logger.info(
f"{termColors.YELLOW}"
f"----PARSE LIMIT REACHED. HALTING PARSER.----"
f"{termColors.ENDC}"
)
break
TransitionDomain.objects.bulk_create(to_create)