Glider
Functions

    save_parquet_s3 (df, event, name)
    upload_data_mongo (df, filename)
    fix_date (date)
    get_period (date, filename)
    create_new_df (event, formats, return_dict)
    process_files_parallel (event, context=None)

Constants

    list merlin_formats
        Merlin formats considered for the 1.5% discount.
    ACCESS_ID
        AWS access key ID.
    ACCESS_KEY
        AWS secret access key.
    mongo_conn = mongo_connection()
        Instance of the MongoDB connection class.
    snap_collection = mongo_conn.mongo_conn_snapshots()
        Mongo collection used to record procedure status.
    collection_name = os.environ.get("COLLECTION")
    collection = mongo_conn.mongo_conn_sales()
        Mongo collection used to store sales.
    s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)
        AWS connection created with boto3.
src.importer.ProcessFiles.create_new_df (event, formats, return_dict)

Executes the full procedure for one format.

Args:
    event (dict): dictionary with all client and sales information
    formats (str): current format to process
    return_dict (dict): dictionary with information on all processed files

Returns:
    return_dict (dict)
Definition at line 128 of file ProcessFiles.py.
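The per-format procedure can be sketched as below, modeling only the Merlin discount step mentioned under `merlin_formats`. The `files` key, the row layout, and the `merlin_formats` field of `event` are illustrative assumptions, not the module's actual structure.

```python
def create_new_df(event, fmt, return_dict):
    """Hypothetical sketch of the per-format procedure.

    The real create_new_df loads the client's file, normalizes dates,
    writes parquet to S3 and uploads rows to Mongo; here only the
    Merlin discount step is modeled.
    """
    rows = event["files"][fmt]  # assumed layout: rows keyed by format name
    # Formats listed in merlin_formats get the 1.5% discount.
    discount = 0.015 if fmt in event.get("merlin_formats", []) else 0.0
    return_dict[fmt] = [
        {**row, "amount": round(row["amount"] * (1 - discount), 2)}
        for row in rows
    ]
    return return_dict
```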
src.importer.ProcessFiles.fix_date (date)

Normalizes the date column to YYYY-MM-DD format.

Args:
    date (datetime): date column

Returns:
    date (str)
Definition at line 99 of file ProcessFiles.py.
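A minimal re-implementation of this normalization, assuming the column holds either datetime objects or a few common string layouts; the formats the real function accepts may differ.

```python
from datetime import datetime

def fix_date(date):
    """Normalize a date value to YYYY-MM-DD (hypothetical sketch)."""
    if isinstance(date, datetime):
        return date.strftime("%Y-%m-%d")
    # Try a few common string layouts (assumption: these cover the inputs).
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"):
        try:
            return datetime.strptime(str(date), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date value: {date!r}")
```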
src.importer.ProcessFiles.get_period (date, filename)

Generates the period for a given date.

Args:
    date (str): date column

Returns:
    period (str)
Definition at line 110 of file ProcessFiles.py.
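Given that `fix_date` produces YYYY-MM-DD strings, the period derivation can be sketched as below. The YYYYMM layout is an assumption; the real implementation may use a different one and may also inspect `filename`.

```python
def get_period(date, filename):
    """Hypothetical sketch: derive a period identifier from a normalized date.

    Assumes the period is the year and month as YYYYMM; `filename` is
    accepted to match the documented signature but unused here.
    """
    year, month, _day = date.split("-")
    return f"{year}{month}"
```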
src.importer.ProcessFiles.process_files_parallel (event, context=None)

Executes the full procedure using multiprocessing to handle several formats at the same time.

Args:
    event (dict): dictionary with all client and sales information
    context (None): required only for AWS Lambda execution

Returns:
    final_output (dict)
Definition at line 196 of file ProcessFiles.py.
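The fan-out over formats can be sketched with the standard multiprocessing module. The worker body, the `formats` and `client` keys of `event`, and the use of the fork start method are placeholders and assumptions for illustration.

```python
import multiprocessing as mp

def _process_format(event, fmt, return_dict):
    # Placeholder for the per-format procedure (create_new_df in the module).
    return_dict[fmt] = f"processed {fmt} for {event['client']}"

def process_files_parallel(event, context=None):
    """Run one worker per format and collect the results (sketch)."""
    ctx = mp.get_context("fork")  # fork keeps this sketch simple on Linux
    manager = ctx.Manager()
    return_dict = manager.dict()  # shared across worker processes
    jobs = [
        ctx.Process(target=_process_format, args=(event, fmt, return_dict))
        for fmt in event["formats"]
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    return dict(return_dict)
```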
src.importer.ProcessFiles.save_parquet_s3 (df, event, name)

Saves the current dataframe as a parquet file in S3.

Args:
    df (pandas dataframe): processed file loaded as a dataframe
    event (dict): dictionary with all client and sales information
    name (str): current filename

Returns:
    response (str)
Definition at line 39 of file ProcessFiles.py.
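The upload can be sketched with an in-memory buffer and the module's `s3_session` (passed explicitly here). The bucket name, the `event_id` key, and the object-key layout are assumptions, and `df.to_parquet` requires pyarrow or fastparquet.

```python
import io

def build_s3_key(event, name):
    # Assumed key layout; the real function may organize objects differently.
    return f"{event['client']}/{event['event_id']}/{name}.parquet"

def save_parquet_s3(df, event, name, s3_session):
    """Serialize df to parquet in memory and upload it to S3 (sketch)."""
    buffer = io.BytesIO()
    df.to_parquet(buffer, index=False)  # needs pyarrow or fastparquet
    buffer.seek(0)
    s3_session.client("s3").upload_fileobj(
        buffer, event["bucket"], build_s3_key(event, name)
    )
    return "saved"
```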
src.importer.ProcessFiles.upload_data_mongo (df, filename)

Uploads batched rows to MongoDB.

Args:
    df (dataframe): dataframe with sales matched against the catalogue

Returns:
    Nothing
Definition at line 69 of file ProcessFiles.py.
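Batched uploads can be sketched as below. Rows are plain dicts here (the real function receives a pandas dataframe), the `collection` argument stands in for the module-level Mongo collection, and tagging each document with its source file is an assumption.

```python
def batched(docs, size):
    """Yield successive slices of at most `size` documents."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

def upload_data_mongo(docs, filename, collection, batch_size=1000):
    """Insert documents into MongoDB in batches (hypothetical sketch)."""
    for batch in batched(docs, batch_size):
        for doc in batch:
            doc["source_file"] = filename  # assumed provenance tag
        collection.insert_many(batch)  # pymongo bulk insert
```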
src.importer.ProcessFiles.ACCESS_ID

AWS access key ID.
Definition at line 25 of file ProcessFiles.py.
src.importer.ProcessFiles.ACCESS_KEY

AWS secret access key.
Definition at line 25 of file ProcessFiles.py.
src.importer.ProcessFiles.collection = mongo_conn.mongo_conn_sales()

Mongo collection used to store sales.
Definition at line 34 of file ProcessFiles.py.
src.importer.ProcessFiles.collection_name = os.environ.get("COLLECTION")
Definition at line 32 of file ProcessFiles.py.
list src.importer.ProcessFiles.merlin_formats

Merlin formats considered for the 1.5% discount.
Definition at line 19 of file ProcessFiles.py.
src.importer.ProcessFiles.mongo_conn = mongo_connection()

Instance of the MongoDB connection class.
Definition at line 28 of file ProcessFiles.py.
src.importer.ProcessFiles.s3_session = boto3.Session(aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)

AWS connection created with boto3.
Definition at line 37 of file ProcessFiles.py.
src.importer.ProcessFiles.snap_collection = mongo_conn.mongo_conn_snapshots()

Mongo collection used to record procedure status.
Definition at line 30 of file ProcessFiles.py.