Glider
src.importer.ProcessFiles Namespace Reference

Functions

 save_parquet_s3 (df, event, name)
 
 upload_data_mongo (df, filename)
 
 fix_date (date)
 
 get_period (date, filename)
 
 create_new_df (event, formats, return_dict)
 
 process_files_parallel (event, context=None)
 

Constants

list merlin_formats
 Merlin formats considered for the 1.5% discount.
 
 ACCESS_ID
 AWS access key ID.
 
 ACCESS_KEY
 AWS secret access key.
 
 mongo_conn = mongo_connection()
 MongoDB connection object.
 
 snap_collection = mongo_conn.mongo_conn_snapshots()
 Mongo collection where the procedure status is uploaded.
 
 collection_name = os.environ.get("COLLECTION")
 
 collection = mongo_conn.mongo_conn_sales()
 Mongo collection for sales storage.
 
 s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)
 Connection for AWS using boto3.
 

Function Documentation

◆ create_new_df()

src.importer.ProcessFiles.create_new_df ( event,
formats,
return_dict )
Executes the full procedure for a single format.

Args:
    event (dict): dictionary with all client and sales information
    formats (str): format currently being processed
    return_dict (dict): dictionary with information on all processed files

Returns: return_dict (dict)

Definition at line 128 of file ProcessFiles.py.


◆ fix_date()

src.importer.ProcessFiles.fix_date ( date)
Normalizes the date column to YYYY-MM-DD format.

Args:
    date (datetime stamp): date column
Returns: date (str)

Definition at line 99 of file ProcessFiles.py.
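
The normalization described above can be sketched with the standard library alone. This is a minimal illustration, not the code in ProcessFiles.py: the name fix_date_sketch is hypothetical, and it assumes that string inputs are ISO-like (for example "2023-04-05 13:45:00").

```python
from datetime import date, datetime


def fix_date_sketch(value):
    """Return value as a YYYY-MM-DD string (hypothetical stand-in for fix_date)."""
    if isinstance(value, (datetime, date)):
        # datetime and date objects both support strftime.
        return value.strftime("%Y-%m-%d")
    # Assumption: anything else is an ISO-like string such as
    # "2023-04-05" or "2023-04-05 13:45:00".
    return datetime.fromisoformat(str(value)).strftime("%Y-%m-%d")
```

Applied to a pandas column, a function of this shape would typically be passed to Series.apply, although the real module may normalize the column differently.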


◆ get_period()

src.importer.ProcessFiles.get_period ( date,
filename )
Generates the period for the given date.

Args:
    date (str): date column
    filename (str): current filename
Returns: period (str)

Definition at line 110 of file ProcessFiles.py.
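
The page does not state the period format, so the following is only a sketch under stated assumptions: the period is taken to be the year and month of a YYYY-MM-DD date, written as YYYYMM, and filename is used only to make error messages traceable. The name get_period_sketch and both of those choices are hypothetical.

```python
from datetime import datetime


def get_period_sketch(date_str, filename):
    """Derive a period string from a YYYY-MM-DD date (hypothetical format: YYYYMM)."""
    try:
        parsed = datetime.strptime(date_str, "%Y-%m-%d")
    except ValueError as exc:
        # Assumption: filename is only used here to identify the offending file.
        raise ValueError(f"invalid date {date_str!r} in {filename}") from exc
    return parsed.strftime("%Y%m")
```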


◆ process_files_parallel()

src.importer.ProcessFiles.process_files_parallel ( event,
context = None )
Executes the full procedure using multiprocessing to process several formats at the same time.

Args:
    event (dict): dictionary with all client and sales information
    context (None): required only for Lambda execution
Returns: final_output (dict)

Definition at line 196 of file ProcessFiles.py.
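
One common way to run one worker per format and collect results in a shared return_dict is a multiprocessing Manager dictionary. The sketch below illustrates that pattern only; the worker process_format_sketch, the "files" key in event, and the result shape are hypothetical, and the real create_new_df does far more work. It requests the "fork" start method, which is available on POSIX systems, so that the example runs when pasted at module level.

```python
import multiprocessing as mp


def process_format_sketch(event, fmt, return_dict):
    """Hypothetical per-format worker, standing in for create_new_df."""
    # Placeholder work: count the files listed for this format.
    # The "files" key is an assumption made for this example.
    files = event.get("files", {}).get(fmt, [])
    return_dict[fmt] = {"processed": len(files)}


def run_parallel_sketch(event, formats):
    """Start one process per format and gather results from a shared dict."""
    ctx = mp.get_context("fork")  # Assumption: POSIX host.
    with ctx.Manager() as manager:
        return_dict = manager.dict()
        workers = [
            ctx.Process(target=process_format_sketch, args=(event, fmt, return_dict))
            for fmt in formats
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        # Copy into a plain dict before the manager shuts down.
        return dict(return_dict)
```

On platforms without fork, the same structure works with the default context as long as the call is placed under an `if __name__ == "__main__":` guard.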

◆ save_parquet_s3()

src.importer.ProcessFiles.save_parquet_s3 ( df,
event,
name )
Saves the current dataframe as a parquet file in S3.

Args:
    df (pandas dataframe): processed file loaded as a dataframe
    event (dict): dictionary with all client and sales information
    name (str): current filename
Returns: response (str)

Definition at line 39 of file ProcessFiles.py.
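
A minimal sketch of writing a dataframe to S3 as parquet is shown below. It is not the module's implementation: the function name, the explicit bucket and prefix parameters (the real function reads its settings from event), and the key layout are assumptions. It relies on two real calls: pandas DataFrame.to_parquet, which accepts a file-like object, and the boto3 S3 client method put_object.

```python
import io


def save_parquet_sketch(df, s3_client, bucket, prefix, name):
    """Serialize df to parquet in memory and upload it; returns the object key."""
    buffer = io.BytesIO()
    # Any object exposing to_parquet(file_like) works here, e.g. a pandas DataFrame.
    df.to_parquet(buffer)
    # Hypothetical key layout: <prefix>/<name>.parquet
    key = f"{prefix.rstrip('/')}/{name}.parquet"
    s3_client.put_object(Bucket=bucket, Key=key, Body=buffer.getvalue())
    return key
```

With the module-level session, the client would presumably be obtained as s3_session.client("s3"). Passing the client in as a parameter keeps the function easy to exercise with a stub.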


◆ upload_data_mongo()

src.importer.ProcessFiles.upload_data_mongo ( df,
filename )
Uploads batch lines to MongoDB.

Args:
    df (dataframe): dataframe with sales matched by catalogue
    filename (str): current filename
Returns: None

Definition at line 69 of file ProcessFiles.py.
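
Batched uploads to MongoDB are usually done by slicing the records and calling insert_many once per slice. The sketch below shows that idea; the function name and the batch size of 1000 are assumptions, and the real function may batch differently. The records argument is a list of dicts, which pandas can produce with df.to_dict("records"), and collection is any object with a PyMongo-style insert_many method.

```python
def upload_in_batches_sketch(records, collection, batch_size=1000):
    """Insert records in fixed-size batches; returns the number of records sent."""
    total = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        # pymongo's Collection.insert_many takes a non-empty list of documents.
        collection.insert_many(batch)
        total += len(batch)
    return total
```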


Constant Documentation

◆ ACCESS_ID

src.importer.ProcessFiles.ACCESS_ID

AWS access key ID.

Definition at line 25 of file ProcessFiles.py.

◆ ACCESS_KEY

src.importer.ProcessFiles.ACCESS_KEY

AWS secret access key.

Definition at line 25 of file ProcessFiles.py.

◆ collection

src.importer.ProcessFiles.collection = mongo_conn.mongo_conn_sales()

Mongo collection for sales storage.

Definition at line 34 of file ProcessFiles.py.

◆ collection_name

src.importer.ProcessFiles.collection_name = os.environ.get("COLLECTION")

Definition at line 32 of file ProcessFiles.py.

◆ merlin_formats

list src.importer.ProcessFiles.merlin_formats
Initial value:
= ["akazoo","alibaba","anghami","AWA", "awa", "boomplay","deezer","iheart","jiosaavn","kkbox","mixcloud","netease","pandora","slacker","soundcloud",
   "soundtrack_your_brand","spotify","tencent","tiktok","uma","yandex", "facebook", "roxi", "triller", "resso", "peloton","snapchat",
   "jaxsta","trebel", "youtube_merlin", "vevo", "youtube_shorts", "youtube_merlin_label", "audiblemagic",
   "facebook_revshare", "joox", "saavn", "tiktok-miniplayer", "kkbox_v2", "soundcloud_v2", "youtube_tier"]

Merlin formats considered for the 1.5% discount.

Definition at line 19 of file ProcessFiles.py.
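
The page does not say which amount the 1.5% discount applies to or where it is computed. As a sketch under those caveats, the example below assumes the discount reduces a revenue amount for any format found in the list. The subset of formats, the constant names, and the function name are hypothetical. Decimal is used so that the result is exact.

```python
from decimal import Decimal

# Illustrative subset only; the module defines the full merlin_formats list.
MERLIN_FORMATS_SUBSET = {"spotify", "deezer", "tiktok"}
MERLIN_DISCOUNT = Decimal("0.015")  # 1.5%


def apply_merlin_discount_sketch(amount, fmt):
    """Reduce amount by 1.5% when fmt is a Merlin format (assumed behaviour)."""
    amount = Decimal(str(amount))
    if fmt in MERLIN_FORMATS_SUBSET:
        return amount * (1 - MERLIN_DISCOUNT)
    return amount
```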

◆ mongo_conn

src.importer.ProcessFiles.mongo_conn = mongo_connection()

MongoDB connection object.

Definition at line 28 of file ProcessFiles.py.

◆ s3_session

src.importer.ProcessFiles.s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)

Connection for AWS using boto3.

Definition at line 37 of file ProcessFiles.py.

◆ snap_collection

src.importer.ProcessFiles.snap_collection = mongo_conn.mongo_conn_snapshots()

Mongo collection where the procedure status is uploaded.

Definition at line 30 of file ProcessFiles.py.