Ingram Document

Uploaded by

Jaya Rai
Ingram

File logic

· Ideal Case - One full daily with all full delta files

StockV2 Full

StockV2 Delta

· Second Option - Part One and Part Two full files once weekly with daily delta files, or one full file weekly with once-daily delta files

Maintaining the Data

· Download the first Full file one time only and all sequential Delta files.

· Only show titles that are currently available to sell (or in stock)

· Show titles that are temporarily out of stock (OS) and can be backordered in the La Vergne
DC.

· Show titles that can be pre-ordered or that have not yet published / released (NYP / NYR).

· Determine if a title is not returnable.

· Restricted titles and country exclusions are very important.

Position 1 - GTIN - 0 – means books – need to check what other values we are getting

15to28 - UPC = we need to remove the 2 leading zeros and save it
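
The UPC rule above can be sketched as follows. This is a minimal illustration, not a confirmed implementation: the function name `normalize_upc` is hypothetical, and rejecting values that do not start with two zeros is an assumption (the notes only say to remove the two zeros).

```python
def normalize_upc(raw_field: str) -> str:
    """Drop the two leading zeros from the 14-character field at positions 15-28."""
    if len(raw_field) != 14:
        raise ValueError(f"expected 14 characters, got {len(raw_field)}")
    if not raw_field.startswith("00"):
        # Assumption: a value without the two-zero prefix is unexpected, so flag it
        # instead of silently truncating real digits.
        raise ValueError(f"field does not start with '00': {raw_field!r}")
    return raw_field[2:]
```

The result is 12 characters long, which matches the `upc_t char(12)` column in the stock_v2 schema.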

39to45 + 46to52 + 53to59 + 60to66 + 67to73 + 74to80 + 81to87 ----- We need to add these and save the sum as total Qty on hand.

88to94 --- this is marked as reserved for future use, so ideally it should be all zeros (0000000); can we verify? If it is, we can include it in the sum above so that when they add a new warehouse we don’t have to change the logic for On Hand Qty.
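
A minimal sketch of the on-hand total, assuming the record is a 301-character string and that the future-use slot at 88to94 is included in the sum as proposed above. The names `total_on_hand`, `QTY_START`, `QTY_WIDTH` and `QTY_FIELDS` are hypothetical; treating a blank field as zero is also an assumption.

```python
QTY_START = 39   # 1-based position of the first quantity field
QTY_WIDTH = 7    # each quantity field is 7 characters, zero filled
QTY_FIELDS = 8   # seven warehouses plus the future-use slot at 88to94


def total_on_hand(record: str) -> int:
    """Sum the consecutive quantity fields from position 39 through 94."""
    total = 0
    for i in range(QTY_FIELDS):
        start = QTY_START - 1 + i * QTY_WIDTH      # convert to a 0-based offset
        raw = record[start:start + QTY_WIDTH].strip()
        total += int(raw or "0")                   # assumption: blank counts as 0
    return total
```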
166to168 - Discount level

REG Trade Discount – 40%

Other titles could be sold as Short, Long or Net. These would break out as follows:
Net = No Discount
Short – Less than REG – Example 30, 25, 20, 15, 10
Long – Larger than REG – Example 50, 55, 60
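
The Net / Short / REG / Long breakdown can be expressed as a small classifier. This sketch assumes the discount is already available as an integer percentage; how positions 166to168 encode Short and Long levels is not stated here, so the function name and its input type are illustrative only.

```python
REG_DISCOUNT = 40  # REG trade discount, in percent


def classify_discount(percent: int) -> str:
    """Name the discount category for a percentage, relative to REG (40%)."""
    if percent == 0:
        return "Net"
    if percent == REG_DISCOUNT:
        return "REG"
    return "Short" if percent < REG_DISCOUNT else "Long"
```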

179to180 – Publisher status code - See reference file, inventory\prodstat.txt for complete list of
values.

AB,"Canceled" ---- We don’t need this

CS,"Availability Uncertain" ---- We don’t need this

EX,"No longer stocked by Ingram" --- We don’t need this

IP,"In print and available" ---- Need to take this

NY,"Not yet published" ----- Need to take this

OI,"Out of stock indefinitely" ---- We don’t need this

OP,"Out of print" ---- We don’t need this

PP,"Postponed indefinitely" ----- We don’t need this

RF,"Referred to another supplier" ---- We don’t need this

RM,"Remaindered" ----- We should take it

TP,"Temporarily out of stock because publisher cannot supply" ----- We don’t need this

WS,"Withdrawn from Sale" ---- We don’t need this

181to188 – Only Y is relevant for us. If a flag is N, the quantity of the respective warehouse in 39to87 should be treated as 0. Ideally the quantity would already be zero when the flag is N.
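
A sketch of the flag rule, assuming the seven stock flags at positions 181 to 187 line up one-to-one, in order, with the seven quantity fields at 39 to 87. That ordering is an assumption to verify against the Ingram specification, and `effective_quantities` is a hypothetical name.

```python
def effective_quantities(record: str) -> list:
    """Return the seven warehouse quantities, forced to 0 where the stock flag is not 'Y'."""
    flags = record[180:187]                # positions 181-187, one flag per warehouse
    quantities = []
    for i, flag in enumerate(flags):
        start = 38 + i * 7                 # positions 39-45, 46-52, ... 81-87
        raw = record[start:start + 7].strip()
        qty = int(raw or "0")
        quantities.append(qty if flag == "Y" else 0)
    return quantities
```

Summing the returned list gives a total that ignores warehouses whose flag is N.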

197to204 --- If the value is “000000” it should be considered for sale

205 - returnable indicator - need to save it as we will use it

206to213 - return date ---- Date = "00010101" – the title is NOT Returnable. Date = "99991231"
– a Return Date has not been set.
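
The two sentinel values can be handled before normal date parsing. A minimal sketch, assuming any other value is a valid YYYYMMDD date; the function name and the status labels are hypothetical.

```python
from datetime import date, datetime
from typing import Optional, Tuple

NOT_RETURNABLE = "00010101"   # the title is NOT returnable
NOT_SET = "99991231"          # a return date has not been set


def parse_return_date(field: str) -> Tuple[str, Optional[date]]:
    """Map the 8-character return date field to a (status, date) pair."""
    if field == NOT_RETURNABLE:
        return ("not_returnable", None)
    if field == NOT_SET:
        return ("not_set", None)
    # Assumption: everything else is a real date in YYYYMMDD format.
    return ("dated", datetime.strptime(field, "%Y%m%d").date())
```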
219 – Backorder only indicator - Consider Y, 1 & 2

220 - Media Mail indicator - this is an important field and helps us calculate shipping cost

221 – Product type – this will define category

R Hardcover - also called cloth, retail trade, or trade --- Book Category

Q Quality Paper - also called trade paper --- Book Category

P Mass Market Paperbacks - always rack size --- Book Category

T Calendars, blank books, and other book-like sideline items - --- Book Category

W Audio - Music category

S Computer Software or Multimedia - Software & Computer/Video Games category

K Video - Video & DVD

X Music Titles - Music

M Gifts, Cards, and other non-book sideline items - need to save this in separate table as this is
for GM.

N Musicland Bargain Books - Books

U Other Spring Arbor - need to save this in a separate table as this is for GM. Need to check whether position one is 0 for all of these or has other values as well.

222 - Imprintable Indicator - Need to go for “N” only

223 - Indexable Indicator - Need to go for “N” only

239to244 - need to make sure it is not 000000. If it is, we still need to insert the record but send a notification so that we can add the weight.

256to259 – Need to check if there are blank or 0000 values. These 4-digit codes are assigned to the publisher name; check file ibspubnum.txt. We need to save the publisher name using the code.

265 – Restricted Code

" ","No Restrictions." ---- We will consider this

"C","Available in the United States and some other countries." ---- We will consider this

"D","Only available in the United States." ---- We will consider this only for domestic marketplaces

"E","Restricted: Licensing agreement required to order the title." ---- We will not consider this

"I","International only title, not available in the United States." ---- We need this only for International marketplaces.

"L","Available to library customers only" ---- We will not consider this.

"P","Restricted: Not available to all customers." ---- We will not consider this.

"V","Only available in the United States format." ---- We will consider this only for domestic marketplaces

"X","Available in some countries, not the United States." ---- We need this only for International marketplaces.
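
The restricted-code decisions above can be captured in a lookup table. This is a sketch under assumptions: the market labels are hypothetical, and treating blank and "C" as eligible for both domestic and international marketplaces is an interpretation of "we will consider this" (country-level exclusions would still apply on top).

```python
DOMESTIC = "domestic"
INTERNATIONAL = "international"

# Codes E, L and P are intentionally absent: those titles are not considered.
ALLOWED_MARKETS = {
    " ": {DOMESTIC, INTERNATIONAL},
    "C": {DOMESTIC, INTERNATIONAL},   # assumption: both, subject to country exclusions
    "D": {DOMESTIC},
    "V": {DOMESTIC},
    "I": {INTERNATIONAL},
    "X": {INTERNATIONAL},
}


def allowed_markets(restricted_code: str) -> frozenset:
    """Return the marketplace groups a title may be listed on (empty if excluded)."""
    return frozenset(ALLOWED_MARKETS.get(restricted_code, ()))
```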

We need to take the Inventory Country Exclusion file from FTP.
File Location: \inventory
File Name: ctry_excl_titles.txt
Availability: download from FTP server, Monday by 1:00am CST, updated weekly
Reference File: country.txt
Details on Page 15.

We need to take one file and see how it handles an EAN that has multiple country exclusions. We will need these details to make a decision.

272to273 – Product Availability Code

10,"Not yet available","Not yet available" ---- Not important, no need to save

20,"Available","Available from Ingram (form of availability unspecified)" ---- Very important, need to take this

21,"In stock","Available from Ingram as a stocked item" ---- Very important, need to take this

22,"To order","Available from Ingram as a non-stocked item, for special order" ---- Not important, we can ignore this

31,"Out of stock","Stocked item - temporarily out of stock" ---- We should save it so that as soon as it is in stock it goes live

40,"Not available","Not available from Ingram (reason unspecified)" ---- No need to consider
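
One way to encode these decisions is a code-to-action map. This is a sketch only: the action labels are hypothetical, and keeping code 31 as "hold" follows the note above even though the Level 1 filter table later lists only 20 and 21.

```python
# Hypothetical action labels: "list" = sell now, "hold" = save but do not list yet,
# "skip" = do not store the title at all.
AVAILABILITY_ACTIONS = {
    "20": "list",
    "21": "list",
    "31": "hold",
}


def availability_action(code: str) -> str:
    """Return what to do with a title given its Product Availability Code."""
    return AVAILABILITY_ACTIONS.get(code, "skip")
```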

274to282 --- imp as it will be used while creating PO.


283to284 - Product Classification Type - we will avoid these titles

Inventory Estimated Time of Arrival (ETA) File
This file provides an estimated time of arrival (ETA) for titles that are temporarily out of stock and On-Order with the publisher. Ingram uses historical publisher fulfillment performance to calculate the ETA for each of the Ingram Distribution Centers (DC).
File Location: \inventory
File Name: [email protected]
----- We don’t need to read this file, as once the titles are live or available this will automatically be reflected in the Delta files.

Inventory International Price File
This file supplies Ingram customers with the publisher supplied price for International currencies. Ingram currently captures the Suggested Retail (SRP) and the Publisher Supplied Price in United States and Canadian dollars. Please see the StockV2 file for the U.S. Suggested Retail Price and the U.S. Publisher Supplied Price.
File Location: \inventory
File Name: [email protected]
--- We don’t need this.

Inventory Removal File
This file provides a list of titles that are no longer stocked by Ingram. It is meant for the case where you miss delta StockV2 files and are unable to process the full stock file to refresh your system: you would then use the list to remove items from your system just as you would with the StockV2 file. Items in this file will have a Product Availability code of “40” – “Not Available from Ingram” in positions 272-273. There are 4 file options, depending on the time frame you would like to process:
• 2 days
• 15 days
• 30 days
• 180 days
File Location: \inventory
--- We should process this file every 15 days.

Proposition 65 Warnings File
This file provides a list of product warnings provided by the publisher to meet the State of California’s Proposition 65 warnings that are required by law to be displayed on websites/in store to California Customers. Proposition 65 requires businesses to provide warnings to Californians about significant exposures to chemicals that cause cancer, birth defects or other reproductive harm. These chemicals can be in the products that Californians purchase, in their homes or workplaces, or that are released into the environment.
File Locations: \inventory \Inventory_Reference_Files
File Name: Prop65_Warnings.txt
--- This is a very important file. We need to take it, mark each EAN where a warning applies, and save the narrative, which we will use while listing.

Publisher Sales Rights
These files provide the Sales Rights by Country, Territory, or Rest of World, as provided by the publisher, where products can be sold. There are 3 files to determine Sales Rights with 3 translation code files; specifications for each are listed below. You will want to check for the Country and Territory Sales Rights Restriction or Inclusion by knowing the Sale Right Type Code and also reviewing the Rest of World (ROW) file.
File Locations: \inventory\SalesRights
File Names:
SalesRights_Country.txt
SalesRights_ROW.txt
SalesRights_Territory.txt
Country_ONIX.txt (translation file)
SalesRightsTypeCd.txt (translation file)
Territory_ONIX.txt (translation file)

We will need to review all 6 files to come to a conclusion on the logic.

Batch ingestion of book data from Ingram

Goal
Build a scalable, maintainable, extendable service to capture price and quantity updates from
vendors. The goal is to keep the inventory as fresh as possible keeping in mind various
engineering challenges. In this case, we need to ingest price and quantity data from Ingram.

High-level requirements
1. Capture price and quantity updates from Ingram.
2. Update the information in our own database and marketplaces.
3. Provide a real-time view of the ingestion pipeline.

Capture price and quantity updates from Ingram

Step 1
We want to capture the data provided by Ingram’s FTP server.
Under the ideal case, we are mainly interested in 2 files:
Data File | File Name | File Type | Avg Size
StockV2 Full | [email protected] | Full | 4GB
StockV2 Delta | stockv2deltaYYMMDDx@ingram.zip | Delta |

StockV2 Full: This is the "master" file, which contains all active Ingram titles that are available
to sell and is generated six times per day.

StockV2 Delta: This Delta file contains all incremental updates for titles where select fields
have changed, and a record has been added or deleted since the previous file was created.
This file is created by comparing the current and previous master files (StockV2 Full). A unique
file name is used for each new file.

With the ideal case, we would ingest StockV2 Full once and all StockV2 Delta files daily.
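
Because delta files must be applied in sequence, the downloader needs a stable ordering. A minimal sketch, assuming the delta naming pattern `stockv2deltaYYMMDDx@ingram.zip` from the file download schedule and assuming that `x` is a single character whose sort order matches creation order within a day (this needs to be confirmed against real files). The function names are hypothetical.

```python
import re
from typing import List, Optional

# YYMMDD date part followed by one suffix character (assumed to sort chronologically).
DELTA_NAME = re.compile(r"^stockv2delta(\d{6})(\w)@ingram\.zip$")


def sort_delta_files(names: List[str]) -> List[str]:
    """Return only the delta file names, ordered by date and then by suffix."""
    keyed = []
    for name in names:
        match = DELTA_NAME.match(name)
        if match:
            keyed.append((match.group(1), match.group(2), name))
    return [name for _, _, name in sorted(keyed)]


def pending_deltas(names: List[str], last_processed: Optional[str]) -> List[str]:
    """Return the delta files that come after the last one already processed."""
    ordered = sort_delta_files(names)
    if last_processed is None:
        return ordered
    if last_processed not in ordered:
        raise ValueError(f"unknown checkpoint file: {last_processed}")
    return ordered[ordered.index(last_processed) + 1:]
```

Keeping the name of the last processed file as a checkpoint lets a failed run resume without re-applying earlier deltas.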

Step 2
After downloading the files, we need to filter the data according to the business requirements.

The StockV2 files have the following specifications:


● File Format: fixed-width text file
● Record Length: 301 characters
● Text Qualifier: none
● Line Terminator: carriage return line feed
● Numeric Fields: right justified and zero filled
● Decimal: two decimals implied for all Price and Weight fields
● Date/Time Format: YYYYMMDD
● Field Format Key:
○ A, N – Alpha Numeric (Text / String)
○ DT–Date/Time
○ N – Numeric
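
These specifications translate directly into string slicing. The sketch below assumes each line has already been decoded to text; the helper names `parse_line`, `field` and `implied_decimal` are hypothetical.

```python
from decimal import Decimal

RECORD_LENGTH = 301


def parse_line(line: str) -> str:
    """Strip the CRLF terminator and check the fixed record length."""
    record = line.rstrip("\r\n")
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(record)}")
    return record


def field(record: str, start: int, end: int) -> str:
    """Return the field at 1-based, inclusive positions start..end."""
    return record[start - 1:end]


def implied_decimal(raw: str) -> Decimal:
    """Convert a zero-filled Price or Weight field with two implied decimals."""
    return Decimal(int(raw)) / 100
```

For example, a price field of `0001999` becomes 19.99.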

Level 1 Filters
Following are the Filters we need to apply to the StockV2 files.

Position | Length | Field Name | Filter | Business justification
1 | 1 | GTIN Prefix | = "0" |
179 – 180 | 2 | Publisher Status Code | = "IP", = "NY", = "RM" |
181 | 1 | La Vergne, TN Stock Flag | = "Y" |
182 | 1 | Roseburg, OR Stock Flag | = "Y" |
183 | 1 | Ft. Wayne, IN Stock Flag | = "Y" |
184 | 1 | Chambersburg, PA Stock Flag | = "Y" |
185 | 1 | Allentown, PA Stock Flag | = "Y" |
186 | 1 | Fresno, CA Stock Flag | = "Y" |
187 | 1 | DC Stock Flag | = "Y" |
197 – 204 | 8 | On Sale Date | = "000000", <= current date + 1 |
219 | 1 | Backorder Only Indicator | = "Y", = "1", = "2" |
221 | 1 | Product Type | = "R", = "Q", = "P", = "T", = "N" |
222 | 1 | Imprintable Indicator | = "N" |
223 | 1 | Indexable Indicator | = "N" |
239 – 244 | 6 | Weight | != "000000" |
265 | 1 | Restricted Code | = " " (blank), = "C", = "D", = "I", = "V", = "X" |
272 – 273 | 2 | Product Availability Code | = 20, = 21 |
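
The Level 1 filters can be combined into a single predicate over a 301-character record. This is a sketch that applies the table literally, with two labeled assumptions: a record passes the stock-flag rows when at least one of the seven flags is "Y", and an On Sale Date consisting only of zeros counts as available. The names `passes_level1`, `on_sale` and `field` are hypothetical.

```python
from datetime import date, datetime, timedelta

PUBLISHER_STATUS = {"IP", "NY", "RM"}
BACKORDER_ONLY = {"Y", "1", "2"}
PRODUCT_TYPES = {"R", "Q", "P", "T", "N"}
RESTRICTED_CODES = {" ", "C", "D", "I", "V", "X"}
AVAILABILITY_CODES = {"20", "21"}


def field(record: str, start: int, end: int) -> str:
    """Return the field at 1-based, inclusive positions start..end."""
    return record[start - 1:end]


def on_sale(raw: str, today: date) -> bool:
    """True when the date is all zeros or not later than today + 1 day."""
    if raw.strip("0") == "":
        return True
    try:
        sale_date = datetime.strptime(raw, "%Y%m%d").date()
    except ValueError:
        return False  # assumption: unparseable dates are rejected
    return sale_date <= today + timedelta(days=1)


def passes_level1(record: str, today: date) -> bool:
    """Apply every Level 1 filter; all conditions must hold."""
    return (
        field(record, 1, 1) == "0"
        and field(record, 179, 180) in PUBLISHER_STATUS
        and "Y" in field(record, 181, 187)   # assumption: any warehouse flag is enough
        and on_sale(field(record, 197, 204), today)
        and field(record, 219, 219) in BACKORDER_ONLY
        and field(record, 221, 221) in PRODUCT_TYPES
        and field(record, 222, 222) == "N"
        and field(record, 223, 223) == "N"
        and field(record, 239, 244) != "000000"
        and field(record, 265, 265) in RESTRICTED_CODES
        and field(record, 272, 273) in AVAILABILITY_CODES
    )
```

Keeping the accepted values in module-level sets makes it easy to adjust a rule once the open questions (for example the GTIN prefix values) are resolved.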

Transformations

Position | Length | From Field Name | From Value | To Field Name | To Value | Business justification
15 – 28 | 14 | UPC | | | Remove 2 zeros from prefix |
39–45, 46–52, 53–59, 60–66, 67–73, 74–80, 81–87, 88–94 | 7 each | | | Total Quantity | 39–45 + 46–52 + 53–59 + 60–66 + 67–73 + 74–80 + 81–87 + 88–94 |
151 – 157, 166 – 168 | 7, 3 | | | Price after discount | Price - Discount Percent |
166 – 168 | 3 | Discount Level | REG; NET | Discount Percent | 40%; 0% |
220 | 1 | Media Mail Indicator | Y; N | Shipping Cost | Y -> Media mail rate chart; N -> Another rate chart |
256 – 259 | 4 | Ingram Publisher Number | | Ingram Publisher Name | See mapping in ipspubnum.txt |
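
A sketch of the price transformation, reading "Price - Discount Percent" as the price reduced by the discount percentage. The REG and NET mappings come from the table; treating any other discount level as a numeric percentage and rounding half-up to cents are assumptions, and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

NAMED_LEVELS = {"REG": 40, "NET": 0}   # discount percent per named level


def discount_percent(level_field: str) -> int:
    """Translate the 3-character discount level into a percentage."""
    level = level_field.strip().upper()
    if level in NAMED_LEVELS:
        return NAMED_LEVELS[level]
    return int(level)   # assumption: other levels carry the percentage as digits


def price_after_discount(price_field: str, level_field: str) -> Decimal:
    """Apply the discount to a 7-character price with two implied decimals."""
    price = Decimal(int(price_field)) / 100
    percent = discount_percent(level_field)
    discounted = price * (100 - percent) / 100
    return discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using `Decimal` rather than floats keeps the result exact enough to store in the `price_after_discount_t decimal(7,2)` column.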

In Step 2, we have level 1 filters defined. Level 1 filters are applied such that the filtered titles fit
our business across marketplaces. Additional filters will be defined below as per the
requirements for each geographic location. Examples of level 2 filters:
- The Inventory Country Exclusion file
- Proposition 65 Warnings File
- Publisher Sales Rights

In Step 2, we also have transformations defined. Transformations prepare the title data in the right format for listing on various marketplaces or as per the requirements of downstream vendors.

Step 3
After filtering and transformations, we need to convert the listings into the output formats required by AOB and ADP.

ADP

Position | Length | Field Name
274 – 282 | 9 | Ingram Title Code
2 – 14 | 13 | EAN
Transformed field | Transformed field | UPC
29 – 38 | 10 | ISBN-10
Transformed field | Transformed field | Total quantity
151 – 157 | 7 | Price
Transformed field | Transformed field | Price after discount
Transformed field | Transformed field | Shipping cost
239 – 244 | 6 | Weight
220 | 1 | Media Mail indicator
205 | 1 | Returnable indicator
221 | 1 | Product Type
| | Restrictions
| | Narration

AOB

Position | Length | Field Name
2 – 14 | 13 | EAN
29 – 38 | 10 | ISBN-10
Transformed field | Transformed field | (Price after discount) + Shipping cost
151 – 157 | 7 | Price
Transformed field | Transformed field | Total quantity
| | List Flag

File download Schedule

File | File name | Download after | Download frequency | Dependent file
Inventory Country Exclusion file | ctry_excl_titles.txt | Monday by 1:00am CST | Weekly | country.txt
Proposition 65 Warnings File | Prop65_Warnings.txt | Available by Sunday 8:00am CDT | Weekly |
Publisher Sales Rights | SalesRights_Country.txt, SalesRights_ROW.txt, SalesRights_Territory.txt, Country_ONIX.txt (translation file), SalesRightsTypeCd.txt (translation file), Territory_ONIX.txt (translation file) | Available by Monday 8:00am CDT | Weekly |
StockV2 Full | [email protected] | Available by 1 am Central time daily | Daily |
StockV2 Delta | stockv2deltaYYMMDDx@ingram.zip | Available by 1 am Central time daily | Multiple times per day |

Technical design document


Technical design for Ingram Feed Processor

Project Plan

Task | Approx days (3 - 4 hours per day) | Start Time | Deadline | Current Status
Initial requirements | | June 23, 2022 | July 18, 2022 | Done
Initial Technical design document | 3 | July 25, 2022 | Aug 1, 2022 | Done
Technical design approval | 2 | Aug 1, 2022 | Aug 8, 2022 | Pending
Finalize technical design document | 1 | Aug 8, 2022 | Aug 9, 2022 | Pending
Proof of Concept | 5 | Aug 9, 2022 | Aug 15, 2022 | Pending
Development | 12 | Aug 16, 2022 | Aug 30, 2022 | Pending
Testing | 6 | Sept 1, 2022 | Sept 9, 2022 | Pending

Pending Tasks

Task | Assigned To
GTIN value confirmation | [email protected]
Publisher Rights files logic | [email protected]
The Inventory Country Exclusion file logic | [email protected]
Proposition 65 Warnings File filtering | Viraj Shah
Understand deeper Restriction Code and Narration for ADP and AOB | Viraj Shah
Get sample files from FTP server | [email protected]
Sample output files for AOB and ADP | [email protected]
Integrate Shipping costs logic | Viraj Shah
Project plan with timelines | Viraj Shah
Initial technical design | Viraj Shah
Review technical design to choose one | [email protected]

Future tasks
Convert Product Type Ingram code in ADP. Please See comments by [email protected]
Understanding updating existing items in ADP.

Technical design for Ingram Feed Processor

Component diagrams

Download files from Ingram SFTP


We need to download files like Stock V2, Stock Delta, Inventory Country Exclusion file, Prop65
warning file, and Publisher Sales rights file and store them in an S3 bucket.

Proposed solutions
Option 1: Use CloudWatch with AWS Lambda to trigger the download of files on a specific
schedule

Advantages:
1. Serverless: We don’t really need to worry about managing compute resources.
2. Monitoring is easier.
3. Separation of tasks with CloudWatch events rule.

Disadvantages:
1. Uses AWS native technologies so migrating to a different cloud provider would mean a
few changes.
Option 2: Use an EC2 instance to download files from Ingram SFTP to Amazon S3

Advantages:
1. Code will be cloud vendor agnostic.

Disadvantages:
1. The maintenance and monitoring burden falls on developers, who must make sure the solution is fault tolerant.

Process helper files


As part of processing the Ingram stock file, we need to use helper files such as the Inventory Country Exclusion file, the Prop65 warning file, and the Publisher Sales Rights files to transform and filter the items as per the business need. As these helper files are updated less frequently than the stock file and are smaller in size, we need to ingest them separately. The ingested data would then need to be persisted in a database.

Proposed solutions

Option 1: Use a relational DB (Amazon Aurora or Amazon RDS) to store the helper files, with an AWS Lambda function that is triggered whenever new files are available in the S3 buckets.
Advantages:
1. Using RDS with MySQL would mean we use existing technology in our tech stack.
2. Easy to replicate in a data warehouse.
3. Query syntax is known and can be used to run ad-hoc queries for analytical purposes.

Disadvantages:
1. If the data size increases RDS can turn out to be expensive.
2. RDS is optimized for storage over reads and writes.
Option 2: Use DynamoDB as it is optimized for high volume and throughput

Advantages:
1. DynamoDB is perfect for high throughput applications.
2. DynamoDB is optimized for reads and writes.
3. Cost effective compared to Relational DB.
Disadvantages:
1. Doesn’t support complex joins.

Filter, Transform and Enrich Stock V2 files


This is the main file that needs to be processed. The StockV2 full file and the StockV2 delta files have the same format, so the same solution can be used for both. The goal is to design a system in which the stock records are filtered, transformed and enriched in such a way that they can be output in multiple formats.

Proposed solutions
Option 1: Use AWS Step functions
With AWS step functions, we define a main orchestrator step function that is responsible for
splitting the file into multiple smaller files and store them in S3. These smaller files are handled
by other chunk step functions which are responsible for filtering, transforming and enriching the
stock records and store the file in S3. After smaller chunks are processed, the main orchestrator
would then be responsible for merging the smaller files and storing the data in S3.

Steps
1. S3 event notification triggers the lambda function once the stock files are available.
2. Lambda function triggers the main step function.
3. Main step function splits the large file into smaller files.
4. For each smaller file trigger the chunk step function.
5. In the chunk step function
a. Transform each record and write into S3.
b. If record transformation fails, write into Dynamo DB.
6. Merge the processed chunk files in the main step function.
7. Store monitoring info from the main step function in the Dynamo DB.
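
Step 3 (splitting the large file into smaller files) could look roughly like the sketch below. It writes chunks to a local directory as a stand-in for S3, since the upload mechanism is not specified here; `split_file`, the `.partNNNNN` naming and the chunk size parameter are all hypothetical.

```python
from pathlib import Path
from typing import List


def split_file(source: Path, out_dir: Path, records_per_chunk: int) -> List[Path]:
    """Split a line-oriented file into chunk files of at most records_per_chunk lines."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks: List[Path] = []
    buffer: List[str] = []

    def flush() -> None:
        if not buffer:
            return
        path = out_dir / f"{source.stem}.part{len(chunks):05d}"
        # newline="" keeps the original CRLF terminators untouched.
        with open(path, "w", newline="") as out:
            out.writelines(buffer)
        chunks.append(path)
        buffer.clear()

    with open(source, "r", newline="") as src:
        for line in src:
            buffer.append(line)
            if len(buffer) >= records_per_chunk:
                flush()
    flush()   # write the final, possibly smaller, chunk
    return chunks
```

Splitting on record boundaries means each chunk can be filtered and transformed independently, and concatenating the chunks in order reproduces the original file.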
Option 2: Use Amazon Kinesis
With Amazon Kinesis, we ingest Ingram stock items in a streaming fashion. This means that if, in the future, we receive a stream of real-time events with the latest stock inventory, this solution would handle it better than the purely batch solution above.

Steps
1. Trigger a Lambda function as soon as the stock file is available in S3.
2. This lambda function would split the stock file into multiple smaller files.
3. The smaller files are then read and enriched by a lambda function.
4. The enriched records are written to Amazon Kinesis Data Streams.
5. All this while, any monitoring updates and errors are written into Dynamo DB.

Convert enriched stock records to required output formats


Once the stock records are enriched, we need to convert them to the required output format.
Proposed solutions
Option 1: Use Lambda function
The simplest approach is to use a lambda function to read data from either S3 or Kinesis Data Streams and write it in different output formats. We would need one or more such lambda functions for each unique output format.

Option 2: Use Kinesis Firehose


Using Kinesis Firehose is a good option when you want to write the data into multiple output
formats. For example: We may want to write the enriched stock items into AOB and ADP file
formats and store them in S3. We may also want to write these records into a data warehouse
like Redshift for analytical purposes. Kinesis firehose would help with that.
Chosen Solution

Download files from Ingram SFTP and Process them


1. We will have an EC2 instance that downloads the file from the Ingram SFTP server.
2. The process unpacks the downloaded files and stores them locally in an attached
volume.
3. Process the downloaded reference files.
a. Add the reference file information to the respective database tables in MySQL.
b. Only if data is changed, filter and directly add the additional data fields to the
stock database in MySQL.
File name | Available Schedule | Process schedule
Prop65_Warnings.txt | Available by Sunday 8:00am CDT | Monday 9:00 AM IST
SalesRights_Country.txt, SalesRights_ROW.txt, SalesRights_Territory.txt, Country_ONIX.txt (translation file), SalesRightsTypeCd.txt (translation file), Territory_ONIX.txt (translation file) | Available by Monday 8:00am CDT | Tuesday 9:00 AM IST
Shipping cost | TBD | TBD



Download country exclusion file from Ingram SFTP and process it
1. We will have an EC2 instance that downloads the file from the Ingram SFTP server.
2. The process unpacks the downloaded files and stores them locally in an attached
volume.
3. Process the downloaded country exclusion files.
a. Add the country exclusion file information to Mongo DB.
b. Only if data is changed, filter and directly add the country exclusion data to the
stock database in MySQL.

File Name | Available Schedule | Process Schedule
ctry_excl_titles.txt | Monday by 1:00am CST | Wednesday 9:00 AM IST

Download and process Stock V2 files from Ingram SFTP


1. We will have an EC2 instance that downloads the file from the Ingram SFTP server.
2. The process unpacks the downloaded files and stores them locally in an attached
volume.
3. In the EC2 instance, for each record in the stock v2 file, apply the filter. The filter removes the items we are not interested in ingesting.
4. After the item is filtered, we would transform it by doing 3 things.
a. Perform aggregations (Inventory)
b. Append restriction data by looking at other data sources in MySQL and Mongo DB.
c. Calculate shipping cost if applicable
5. Post transformations, we finally persist the record into MySQL.
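
The filter, transform and persist loop in steps 3 to 5 might be wired together as sketched below. To keep the example runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for MySQL (an assumption, including the `INSERT OR REPLACE` upsert syntax, which is SQLite-specific), and it stores only two of the stock_v2 columns. The `ingest` function and its `keep` and `transform` callables are hypothetical.

```python
import sqlite3
from typing import Callable, Dict, Iterable


def ingest(
    records: Iterable[str],
    keep: Callable[[str], bool],
    transform: Callable[[str], Dict[str, object]],
    conn: sqlite3.Connection,
) -> int:
    """Filter, transform and upsert records keyed by EAN; return the rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stock_v2 ("
        "ean TEXT PRIMARY KEY, total_quantity_t INTEGER)"
    )
    written = 0
    for record in records:
        if not keep(record):
            continue                      # filtered out by the Level 1 rules
        row = transform(record)
        # Later records for the same EAN overwrite earlier ones, which is how
        # a delta file would update data loaded from the full file.
        conn.execute(
            "INSERT OR REPLACE INTO stock_v2 (ean, total_quantity_t) VALUES (?, ?)",
            (row["ean"], row["total_quantity_t"]),
        )
        written += 1
    conn.commit()
    return written
```

Passing the filter and the transformation in as callables keeps the loop identical for the full file and the delta files.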

File Name | Available Schedule | Process Schedule
[email protected] | Available daily at 1, 5, 9 am and pm central time, except for Sunday when it is available at 5, 9 am, and 1, 5, 9 pm central time | One time only
[email protected] | Available daily at 1, 5, 9 am and pm central time, except for Sunday when it is available at 5, 9 am, and 1, 5, 9 pm central time | Every file that is available. So 2, 6, 10 am and 2, 6, 10 pm central time

MySQL Table Schemas


Prop_65 warnings Data

Table Name: prop_65_warnings

Field Name | Field Type | Restrictions
ean | char(13) | Primary key
warning_narrative | varchar(300) | Not null

Stock_v2 file

Table Name: stock_v2

Field Name | Field Type | Restrictions
ean | char(13) | Primary Key
gtin_prefix | char(1) |
upc | char(14) |
isbn_10 | char(10) |
price | decimal(7,2) |
discount_level | char(3) |
publisher_status_code | char(2) |
on_sale_date | date |
returnable_indicator | char(1) |
return_date | date |
backorder_only_indicator | char(1) |
media_mail_indicator | char(1) |
product_type | char(1) |
imprintable_indicator | char(1) |
indexable_indicator | char(1) |
weight | decimal(6,2) |
restricted_code | char(1) |
product_availability_code | char(2) |
ingram_title_code | char(9) |
upc_t | char(12) |
total_quantity_t | int unsigned |
discount_percent_t | smallint unsigned |
price_after_discount_t | decimal(7,2) |
shipping_cost_t | smallint unsigned |
ingram_publisher_name | varchar(100) |
