Glider
src.importer.CreateSnapshots Namespace Reference

Functions

 get_data (file, bucket_out, path_out)
 
 group_by_field (data, fields, snapshot)
 
 upload_mongo (results, file_db_id, update, length)
 
 results_grouped (event, context)
 

Constants

 mongo_conn = mongo_connection()
 Instance of the class used to connect to MongoDB.
 
 collection = mongo_conn.mongo_conn_snapshots()
 MongoDB collection to which snapshots are uploaded.
 
 ACCESS_ID
 AWS access key ID.
 
 ACCESS_KEY
 AWS secret access key.
 
 s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)
 Connection for AWS using boto3.
 

Function Documentation

◆ get_data()

src.importer.CreateSnapshots.get_data ( file,
bucket_out,
path_out )
Receives an S3 path and loads the data using awswrangler.

Args:
    file (str): current filename
    bucket_out (str): bucket where the parquet file is stored
    path_out (str): S3 path where the parquet file is stored
Returns: df (pandas dataframe)

Definition at line 31 of file CreateSnapshots.py.
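
A minimal sketch of how such a loader might look. The source does not show how `file`, `bucket_out` and `path_out` are combined, so the `s3://bucket/path/file` layout, the helper `build_s3_path`, and the optional `session` parameter are assumptions for illustration only.

```python
def build_s3_path(file, bucket_out, path_out):
    """Compose an S3 URI. The bucket/path/file layout is an assumption."""
    bucket = bucket_out.strip("/")
    prefix = path_out.strip("/")
    parts = [bucket, prefix, file] if prefix else [bucket, file]
    return "s3://" + "/".join(parts)


def get_data_sketch(file, bucket_out, path_out, session=None):
    """Load one parquet object from S3 into a pandas DataFrame."""
    # Imported lazily so the path helper works without awswrangler installed.
    import awswrangler as wr

    path = build_s3_path(file, bucket_out, path_out)
    # With boto3_session=None, awswrangler falls back to the default session.
    return wr.s3.read_parquet(path=path, boto3_session=session)
```

In the documented module the boto3 session would presumably be the module-level `s3_session`; passing it in explicitly keeps the sketch easy to test.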


◆ group_by_field()

src.importer.CreateSnapshots.group_by_field ( data,
fields,
snapshot )
Groups the data by service_id, territory_code, artists and tracks.

Args:
    data (pandas dataframe): current parquet file loaded as dataframe
    fields (list): list of fields to include in the snapshot
    snapshot (dict): dictionary that will hold the information for all fields
Returns: snap (dict)

Definition at line 44 of file CreateSnapshots.py.
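
The per-field grouping can be sketched as follows. This is not the module's implementation: it swaps the pandas DataFrame for a plain list of row dictionaries (obtainable with `data.to_dict("records")`), and it assumes that the value stored per field is a row count per distinct value, which the source does not state.

```python
from collections import Counter


def group_by_field_sketch(rows, fields, snapshot):
    """Fill `snapshot` with one {value: row_count} mapping per field."""
    for field in fields:
        # Count how many rows carry each distinct value of this field,
        # skipping rows where the field is missing or None.
        counts = Counter(
            row[field] for row in rows if row.get(field) is not None
        )
        snapshot[field] = dict(counts)
    return snapshot


# Hypothetical sample rows for illustration.
rows = [
    {"service_id": "svc_a", "territory_code": "US"},
    {"service_id": "svc_a", "territory_code": "MX"},
    {"service_id": "svc_b", "territory_code": "US"},
]
snap = group_by_field_sketch(rows, ["service_id", "territory_code"], {})
```

With a real DataFrame, the same counts could be obtained per field with `data.groupby(field).size().to_dict()`.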


◆ results_grouped()

src.importer.CreateSnapshots.results_grouped ( event,
context )
Executes the full procedure.

Args:
    event (dict): dictionary with all client and sales information
    context (none): required only for Lambda execution
Returns: (dict)

Definition at line 101 of file CreateSnapshots.py.
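
A hedged sketch of how a Lambda-style entry point could chain the three documented functions. The event keys (`file`, `bucket_out`, `path_out`, `fields`, `file_db_id`), the load, group, upload order, and the shape of the returned dictionary are all assumptions; the source only says that the function receives `event` and `context` and returns a dict. The steps are injected as arguments so the flow can be exercised without AWS or MongoDB.

```python
from datetime import datetime, timezone


def make_handler(load, group, upload):
    """Build a handler from three callables matching the documented
    signatures of get_data, group_by_field and upload_mongo."""

    def handler(event, context):
        # 1. Load the parquet file referenced by the event.
        data = load(event["file"], event["bucket_out"], event["path_out"])
        # 2. Group the requested fields into a fresh snapshot dict.
        snapshot = group(data, event["fields"], {})
        # 3. Store the snapshot together with a timestamp and row count.
        updated_at = datetime.now(timezone.utc)
        upload(snapshot, event["file_db_id"], updated_at, len(data))
        return {"status": "ok", "rows": len(data)}

    return handler
```

The `context` argument is accepted but not used, matching the note that it is needed only for Lambda execution.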


◆ upload_mongo()

src.importer.CreateSnapshots.upload_mongo ( results,
file_db_id,
update,
length )
Updates the stored fields using the snapshot variable.

Args:
    results (dict): snapshot results to upload
    file_db_id (str): database ID of the file the snapshot belongs to
    update (datetime): timestamp at which the snapshot is updated
    length (int): total rows
Returns: Nothing

Definition at line 86 of file CreateSnapshots.py.
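
One way such an update could be written with PyMongo's `Collection.update_one` is sketched below. The document field names (`file_db_id`, `snapshot`, `updated_at`, `total_rows`) and the use of `upsert=True` are assumptions, and the collection is passed in as a parameter here, whereas the documented module uses the module-level `collection`.

```python
def upload_mongo_sketch(collection, results, file_db_id, update, length):
    """Write the snapshot for one file into a MongoDB collection."""
    # Match the document that belongs to this file (hypothetical key name).
    filter_doc = {"file_db_id": file_db_id}
    # $set replaces only the listed fields and leaves the rest untouched.
    update_doc = {
        "$set": {
            "snapshot": results,
            "updated_at": update,
            "total_rows": length,
        }
    }
    # upsert=True creates the document if no match exists (an assumption).
    collection.update_one(filter_doc, update_doc, upsert=True)
```

Any object exposing a compatible `update_one` method can stand in for the collection, which makes the function straightforward to test without a running database.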


Constant Documentation

◆ ACCESS_ID

src.importer.CreateSnapshots.ACCESS_ID

AWS access key ID.

Definition at line 26 of file CreateSnapshots.py.

◆ ACCESS_KEY

src.importer.CreateSnapshots.ACCESS_KEY

AWS secret access key.

Definition at line 26 of file CreateSnapshots.py.

◆ collection

src.importer.CreateSnapshots.collection = mongo_conn.mongo_conn_snapshots()

MongoDB collection to which snapshots are uploaded.

Definition at line 23 of file CreateSnapshots.py.

◆ mongo_conn

src.importer.CreateSnapshots.mongo_conn = mongo_connection()

Instance of the class used to connect to MongoDB.

Definition at line 21 of file CreateSnapshots.py.

◆ s3_session

src.importer.CreateSnapshots.s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)

Connection for AWS using boto3.

Definition at line 29 of file CreateSnapshots.py.