A Case Study - Helping to illustrate how it works

 

Your Requirement -

 

You have a data set containing the list of sales for the last 3 months – some 1,000,000+ records. Each record in the text file represents a sale defining - date of sale, sale amount, product barcodes, sales person, customer salutation, customer surname, customer email address.

 

You also have 2 more supporting data sets:

  • a text file that contains the customer first name, surname, gender, email address and cell phone number;

  • a text file that contains the brand, model, barcode, cost price and sales price;

 

You decide you want to send an SMS – offering a 15% discount – to each customer that bought a Brand A product in the last month (April) where total sale amount was greater than R1500.00.

 

At the same time the manufacturer of Brand A has asked for a report detailing the number of units of each model sold during the month of April –  and what percentage were sold to females.

 

Your data looks like this:

 

File 1

03-Apr-15,0000154322,9780201379624|9780201479642|9780201869635,JBloggs,Mr,Tmensah,mensa@gmail.com

 

File 2

Thomas,Mensah,male,mensa@gmail.com,+27793889990

 

File 3

Brand A                 ModelX                  978020186963500000600000000089999

 
Consolidating & Extraction
 

The data highlighted below will need to be extracted:

 

File 1

03-Apr-15,0000154322,9780201379624|9780201479642|9780201869635,JBloggs,Mr,Tmensah,mensa@gmail.com

 

File 2

Thomas,Mensah,male,mensa@gmail.com,+27793889990

 

File 3

Brand A                 ModelX                  978020186963500000600000000089999

 

Note:

  • In file 1 and 2 the fields are comma separated and are identified accordingly.

  • In file 3 each field occurs at a specific position in the record and adheres to a specific length. The fields are identified according to this pattern.

Merging & Transforming
 

The records that are common to all 3 data sets will be merged and transformed and output to a single temporary file.

 

File 1

03-Apr-15,0000154322,9780201379624|9780201479642|9780201869635,Mr,Tmensah,mensa@gmail.com

 

File 2

male,mensa@gmail.com,+27793889990

 

File 3

Brand A                 ModelX                  9780201869635

 

1st Temporary Output File

03-Apr-15,0000154322,Brand A                  ,ModelX                   ,Mr,T,mensah,male,+27793889990

Validation & Formatting
 

Specific fields of each record in the 1st temporary file are validated against a defined format. If validation fails, then action will be taken to format the offending field as required.

 

1st Temporary Output File

03-Apr-15,0000154322,Brand A                  ,ModelX                   ,Mr,T,mensah,male,+27793889990

 

2nd Temporary Output File

April;R1543,22;Brand A ;ModelX;Mr;T;Mensah;male;0027793889990

Sorting & Filtering
 

The 2nd temporary file is checked for duplicate records. Any duplicate records are removed and the data then sorted on the date field - ascending order to ensure that all records with a “date of sale” of April will be grouped together and presented at the top of the final output file.

 

The data will then be filtered and the data output to 2 final files – one to meet requirement 1 and the other requirement 2.

 

2nd Temporary Output File

April;R1543,22;Brand A ;ModelX;Mr;T;Mensah;male;0027793889990

 

Requirement 1 – Final Output File

April;R1543,22;Brand A;Mr;T;Mensah;0027793889990

{Filter: Where "product barcodes" = "Brand A" and "date of sale" = "April" and "sale amount" > R1500,00}

 

Requirement 2 – Final Output File

April;Brand A;ModelX;male

{Filter: Where “model" != “null" and "date of sale" = "April"}

Reporting
 

As per Requirement 2 – the Final Output File is translated and output as a report.

 

Requirement 2 – Final Output File

April;Brand A;ModelX;male

Automation
 

As per Requirement 1 – the Final Output File and report file is then emailed to an email address and / or uploaded to an SFTP directory of your choice.

 

You may want to run the same campaign for 6 consecutive months – in which case the Data-Crunchers process can be scheduled to automatically execute on the 1st of each month – pull the input files from an email or SFTP directory and output the files to an email address / SFTP directory.