Glider
src.importer.IdentifyFormat Namespace Reference

Functions

 download_obj (event, file)
 
 csvHeaders (event, file)
 
 identifyHeaders (headers, collection)
 
 type_schema (schema)
 
 cols_otto (schema, ottoMapping)
 
 identify_format (event, context=None)
 

Constants

 s3_file_obj = None
 Holds the file content downloaded from S3.
 
 ENV = os.environ.get("ENVIRONMENT")
 Environment where the code is running (Development/Production).
 
 ACCESS_ID
 AWS access key ID (passed to boto3 as aws_access_key_id).
 
 ACCESS_KEY
 AWS secret access key (passed to boto3 as aws_secret_access_key).
 
 mongo_conn = mongo_connection()
 Instance of the class used to connect to MongoDB.
 
 collection = mongo_conn.mongo_conn_formats()
 Mongo collection to search formats.
 
 snap_collection = mongo_conn.mongo_conn_snapshots()
 Mongo collection to upload snapshots.
 
 s3_client = boto3.client("s3", aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)
 S3 client for AWS, created with boto3.
 

Function Documentation

◆ cols_otto()

src.importer.IdentifyFormat.cols_otto ( schema,
ottoMapping )
Builds the relation between the main features of each column in the file and the mapping template.

Args:
    schema (list): column features from the current file
    ottoMapping (list): column features from the desired template
Returns: match (list)

Definition at line 122 of file IdentifyFormat.py.
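
The matching step can be illustrated with a minimal sketch. It assumes each column is a dict with `name` and `type` keys and that columns are paired by name; the function `match_columns` and the shape of the returned entries are hypothetical and not taken from IdentifyFormat.py.

```python
def match_columns(schema, otto_mapping):
    """Pair each file column with the template column that shares its name.

    Both arguments are lists of dicts with hypothetical "name" and "type" keys.
    """
    # Index the template columns by lower-cased name for case-insensitive lookup
    by_name = {col["name"].lower(): col for col in otto_mapping}
    match = []
    for position, col in enumerate(schema):
        target = by_name.get(col["name"].lower())
        match.append({
            "position": position,
            "source": col["name"],
            "target": target["name"] if target else None,
            # True only when a template column was found and the types agree
            "same_type": bool(target) and target.get("type") == col.get("type"),
        })
    return match
```

Columns without a counterpart in the template are kept in the result with `target` set to `None`, so the caller can decide whether an incomplete match is acceptable.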


◆ csvHeaders()

src.importer.IdentifyFormat.csvHeaders ( event,
file )
Takes the s3_file content and decodes it to get the headers.

Args:
    event (dict): dictionary with all client and sales information
    file (str): current filename
Returns: headers (list)

Definition at line 58 of file IdentifyFormat.py.
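
As an illustration of the decode-then-read-headers step, here is a minimal sketch that works on bytes already in memory. The helper `read_headers` and its `encoding` and `delimiter` parameters are assumptions made for the example; the real function receives `event` and `file` instead.

```python
import csv
import io


def read_headers(raw: bytes, encoding: str = "utf-8", delimiter: str = ",") -> list:
    """Decode raw file bytes and return the first row as a list of header names."""
    text = raw.decode(encoding)
    # newline="" leaves line endings untouched so the csv module can handle them
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    first_row = next(reader, [])  # an empty file yields an empty header list
    return [cell.strip() for cell in first_row]
```

For example, `read_headers(b"sku, name,price\r\n1,a,2\r\n")` returns `["sku", "name", "price"]`.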


◆ download_obj()

src.importer.IdentifyFormat.download_obj ( event,
file )
Receives the file info and reads the first 1024 bytes to determine the encoding.

Args:
    event (dict): dictionary with all client and sales information
    file (str): current filename
Returns: s3_file (str)
         charenc (str)

Definition at line 39 of file IdentifyFormat.py.
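
The idea of inspecting only the first 1024 bytes can be sketched as follows. This is not the module's implementation: it swaps in trial decoding with the standard library, whereas the original may rely on a detection library. The helper `guess_encoding` and its candidate list are hypothetical.

```python
import codecs

SAMPLE_SIZE = 1024  # number of leading bytes inspected, as stated in the description


def guess_encoding(raw: bytes, candidates=("utf-8", "cp1252", "latin-1")) -> str:
    """Return the first candidate codec that decodes the leading sample."""
    sample = raw[:SAMPLE_SIZE]
    for name in candidates:
        decoder = codecs.getincrementaldecoder(name)()
        try:
            # final=False tolerates a multi-byte character cut off by the slice
            decoder.decode(sample, final=False)
            return name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte value, so it is a safe last resort
    return "latin-1"
```

Trial decoding only proves that a codec can read the sample, not that it is the intended one, so the order of `candidates` acts as a priority list.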


◆ identify_format()

src.importer.IdentifyFormat.identify_format ( event,
context = None )
Executes the full procedure, filtering csv, txt and xls files.

Args:
    event (dict): dictionary with all client and sales information
    context (None): required only for Lambda execution
Returns: (dict)

Definition at line 143 of file IdentifyFormat.py.
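
A minimal sketch of a Lambda-style entry point that filters files by extension is shown below. The `files` key in `event`, the names `handler` and `is_supported`, and the shape of the returned dict are all assumptions; the structure of the real `event` is not documented here.

```python
import os

# Extensions named in the description above
SUPPORTED_EXTENSIONS = {".csv", ".txt", ".xls"}


def is_supported(filename: str) -> bool:
    """Check the file extension case-insensitively."""
    return os.path.splitext(filename)[1].lower() in SUPPORTED_EXTENSIONS


def handler(event, context=None):
    """Split the incoming filenames into accepted and rejected lists.

    `context` is unused; it is present only to match the Lambda signature.
    """
    files = event.get("files", [])  # hypothetical key
    accepted = [f for f in files if is_supported(f)]
    rejected = [f for f in files if not is_supported(f)]
    return {"accepted": accepted, "rejected": rejected}
```

Defaulting `context` to `None` lets the same function be called locally with only an `event` dict, which matches the signature documented above.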


◆ identifyHeaders()

src.importer.IdentifyFormat.identifyHeaders ( headers,
collection )
Searches for the format using the headers and retrieves the template information. Also builds a list with the main features of each column.

Args:
    headers (list): the first X lines of the current file
    collection (mongo collection): Mongo collection where format templates are stored
Returns: template_format (dict)

Definition at line 84 of file IdentifyFormat.py.
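
The lookup can be illustrated with an in-memory stand-in for the collection. In this sketch the templates are plain dicts with a hypothetical `headers` field, and `find_template` is an illustrative name rather than part of the module.

```python
def find_template(headers, templates):
    """Return the first template whose normalized headers equal the file's, else None."""
    wanted = [h.strip().lower() for h in headers]
    for template in templates:
        known = [h.strip().lower() for h in template["headers"]]
        if known == wanted:
            return template
    return None
```

Against a real collection, a query such as `collection.find_one({"headers": headers})` could play the same role, assuming the documents store the header list under a `headers` field.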


◆ type_schema()

src.importer.IdentifyFormat.type_schema ( schema )
Takes the schema field from the database and extracts the main features of each column.

Args:
    schema (list): column features from the current file
Returns: template_format (dict)

Definition at line 105 of file IdentifyFormat.py.
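
Reducing a schema to its main features might look like the sketch below. The choice of `name` and `type` as the main features, and the helper `main_features` itself, are assumptions made for illustration.

```python
def main_features(schema, keys=("name", "type")):
    """Reduce each column description to the selected keys, preserving column order.

    Keys missing from a column are reported as None rather than raising.
    """
    return [{k: column.get(k) for k in keys} for column in schema]
```

For instance, `main_features([{"name": "sku", "type": "str", "required": True}])` returns `[{"name": "sku", "type": "str"}]`.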


Constant Documentation

◆ ACCESS_ID

src.importer.IdentifyFormat.ACCESS_ID

AWS access key ID (passed to boto3 as aws_access_key_id).

Definition at line 27 of file IdentifyFormat.py.

◆ ACCESS_KEY

src.importer.IdentifyFormat.ACCESS_KEY

AWS secret access key (passed to boto3 as aws_secret_access_key).

Definition at line 27 of file IdentifyFormat.py.

◆ collection

src.importer.IdentifyFormat.collection = mongo_conn.mongo_conn_formats()

Mongo collection to search formats.

Definition at line 32 of file IdentifyFormat.py.

◆ ENV

src.importer.IdentifyFormat.ENV = os.environ.get("ENVIRONMENT")

Environment where the code is running (Development/Production).

Definition at line 23 of file IdentifyFormat.py.

◆ mongo_conn

src.importer.IdentifyFormat.mongo_conn = mongo_connection()

Instance of the class used to connect to MongoDB.

Definition at line 30 of file IdentifyFormat.py.

◆ s3_client

src.importer.IdentifyFormat.s3_client = boto3.client("s3", aws_access_key_id=ACCESS_ID, aws_secret_access_key=ACCESS_KEY)

S3 client for AWS, created with boto3.

Definition at line 37 of file IdentifyFormat.py.

◆ s3_file_obj

src.importer.IdentifyFormat.s3_file_obj = None

Holds the file content downloaded from S3.

Definition at line 20 of file IdentifyFormat.py.

◆ snap_collection

src.importer.IdentifyFormat.snap_collection = mongo_conn.mongo_conn_snapshots()

Mongo collection to upload snapshots.

Definition at line 34 of file IdentifyFormat.py.