
AdvaRisk Assignment - Data Science Internship Round 1

The document provides two problem statements for an algorithm to match owner names from an application to a database and to transliterate Hindi names to English. For the first problem, the algorithm should match names with variations like reordered parts or abbreviations. For the second problem, the algorithm should transliterate Hindi names to English without external APIs by using rule-based natural language processing techniques. Sample names are provided to test the transliteration. The algorithms and code are due by August 30th.

Uploaded by

yashvi jain

Case Study

(Deadline: 30th August 2021. Submit before 2 pm to [email protected])

Submit the assignment in two files as mentioned below

1. Python code file

2. Write the logic used for the code in Microsoft Word. If you are unable to code the problem
statement, you can instead write your logical reasoning in the Word file.

AdvaSmart is a product developed by AdvaRisk to identify fraud in the mortgage-based lending space.
The use case applies not only to home loans but also to loans against property and SME/promoter
funding.

The product relies on property transactions at a Sub-registrar office (SRO) working under the revenue
department of the state government for identifying suspicious transactions. Using complex name &
address matching algorithms, the product searches for property transactions in the database.

Problem statement 1: Owner name data in our database is not in a standard format, so there is
a high chance of the system missing important records while performing the name and address matching
tasks.

Write an algorithm and its code to match the input with DB names.

- Given: The first column holds the input, and the second column holds the DB string to be matched against it.
- The output of your algorithm should be the result shown in the third column.
- Note: A solution without any external API dependency earns brownie points.

Owner name received through application (Input) | Owner name available in DB | Result
विजय कुमार राठी | विजय | Match
विजय कुमार राठी | कुमार विजय | Match
विजय कुमार राठी | वी के राठी | Match
विजय कुमार राठी | विनय राठी | Not Match
विजय कुमार राठी | श्री वीजय राठी | Match
विजय कुमार राठी | विजय राठी | Match
विजय कुमार राठी | विजय आर | Match
विजय कुमार राठी | विनय कुमार राठी | Not Match
विजय कुमार राठी | विजय कुमार राही | Not Match
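One possible rule-based approach to the table above is sketched here in Python. It is not a reference solution. It assumes that every DB token must pair with a distinct input token, that honorifics such as श्री can be dropped, that long and short i/u vowels are interchangeable spelling variants, and that spelled-out English letters (वी, के, आर) stand for initials. The two DB names whose first letter was lost in extraction are read as वी के राठी and श्री वीजय राठी. The function names and the small lookup tables are hypothetical and would need extending for real data.

```python
import unicodedata

# Assumption: a small illustrative honorific list, not an exhaustive one.
HONORIFICS = {"श्री", "श्रीमती", "सुश्री"}

# Assumption: Hindi spellings of English letter names used as initials,
# mapped to the Devanagari consonant a full name part would start with.
LETTER_NAMES = {"वी": "व", "के": "क", "आर": "र"}

# Treat long and short i/u as the same vowel (e.g. वीजय and विजय).
VOWEL_FOLD = str.maketrans({"ी": "ि", "ू": "ु", "ई": "इ", "ऊ": "उ"})


def _tokens(name):
    """Split a name into parts, dropping trailing dots and honorifics."""
    parts = unicodedata.normalize("NFC", name).split()
    parts = [p.strip(".") for p in parts]
    return [p for p in parts if p and p not in HONORIFICS]


def _token_matches(db_tok, in_tok):
    """A DB part matches as an initial or as a vowel-folded exact match."""
    if db_tok in LETTER_NAMES:
        return in_tok.startswith(LETTER_NAMES[db_tok])
    return db_tok.translate(VOWEL_FOLD) == in_tok.translate(VOWEL_FOLD)


def _assign(db_tokens, in_tokens):
    """Backtracking search: pair every DB part with a distinct input part."""
    if not db_tokens:
        return True
    head, rest = db_tokens[0], db_tokens[1:]
    for i, tok in enumerate(in_tokens):
        remaining = in_tokens[:i] + in_tokens[i + 1:]
        if _token_matches(head, tok) and _assign(rest, remaining):
            return True
    return False


def names_match(input_name, db_name):
    """Return True when the DB name is an order-free subset of the input."""
    db_tokens = _tokens(db_name)
    if not db_tokens:
        return False
    return _assign(db_tokens, _tokens(input_name))


if __name__ == "__main__":
    for db in ["कुमार विजय", "वी के राठी", "विनय राठी", "विजय कुमार राही"]:
        label = "Match" if names_match("विजय कुमार राठी", db) else "Not Match"
        print(db, "->", label)
```

Exact comparison after vowel folding keeps near misses such as विनय and राही apart from विजय and राठी. The trade-off is that genuine spelling variants beyond vowel length would need extra rules or an edit-distance threshold.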

Problem statement 2: Write an algorithm and build its equivalent code to transliterate the
above-mentioned names.

- The source language is Hindi and the target language is English.
- The solution should not have any external API dependency.
- Brownie points for the correct transliteration of the following names: शंकर, अंबर, प्रद्युम्न, ऐश्वर्या, वैनतेय

Note: You may use basic rule-based NLP techniques also.
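A minimal rule-based sketch of such a transliterator in Python follows. It is one possible approach under stated assumptions, not the expected answer. It maps each Devanagari consonant, vowel and vowel sign to a Latin string, renders the anusvara as "m" before labial consonants and "n" elsewhere, writes व as "w" when it is the second member of a conjunct (a common romanisation convention, assumed here), and drops only the word-final inherent "a" unless the word ends in a conjunct. Medial schwa deletion is not attempted. The bonus names are taken in their repaired spellings प्रद्युम्न, ऐश्वर्या and वैनतेय, and all identifiers are hypothetical.

```python
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "घ": "gh", "ङ": "n",
    "च": "ch", "छ": "chh", "ज": "j", "झ": "jh", "ञ": "n",
    "ट": "t", "ठ": "th", "ड": "d", "ढ": "dh", "ण": "n",
    "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
    "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m",
    "य": "y", "र": "r", "ल": "l", "व": "v",
    "श": "sh", "ष": "sh", "स": "s", "ह": "h",
}
VOWELS = {
    "अ": "a", "आ": "a", "इ": "i", "ई": "i", "उ": "u", "ऊ": "u",
    "ऋ": "ri", "ए": "e", "ऐ": "ai", "ओ": "o", "औ": "au",
}
MATRAS = {
    "ा": "a", "ि": "i", "ी": "i", "ु": "u", "ू": "u",
    "ृ": "ri", "े": "e", "ै": "ai", "ो": "o", "ौ": "au",
}
VIRAMA = "\u094d"    # halant: suppresses the inherent vowel
ANUSVARA = "\u0902"  # nasal dot above the line
LABIALS = set("पफबभम")


def transliterate_word(word):
    """Transliterate one Devanagari word into lowercase Latin letters."""
    out = []
    n = len(word)
    i = 0
    while i < n:
        ch = word[i]
        nxt = word[i + 1] if i + 1 < n else ""
        if ch in CONSONANTS:
            after_virama = i > 0 and word[i - 1] == VIRAMA
            # Assumed convention: conjunct-final व is written "w".
            out.append("w" if ch == "व" and after_virama else CONSONANTS[ch])
            if nxt == VIRAMA:        # no vowel follows this consonant
                i += 2
                continue
            if nxt in MATRAS:        # explicit vowel sign
                out.append(MATRAS[nxt])
                i += 2
                continue
            # Inherent "a": dropped only at the end of a word, and kept
            # after a conjunct or in a one-letter word.
            drop = i == n - 1 and i > 0 and not after_virama
            if not drop:
                out.append("a")
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch == ANUSVARA:
            out.append("m" if nxt in LABIALS else "n")
        else:
            out.append(ch)           # pass unknown characters through
        i += 1
    return "".join(out)


def transliterate(text):
    """Transliterate a space-separated name and capitalise each part."""
    return " ".join(transliterate_word(w).capitalize() for w in text.split())


if __name__ == "__main__":
    for name in ["विजय कुमार राठी", "शंकर", "अंबर", "प्रद्युम्न", "ऐश्वर्या", "वैनतेय"]:
        print(name, "->", transliterate(name))
```

Under these rules वैनतेय comes out as "Vainatey", following Hindi pronunciation. The Sanskrit-style spelling "Vainateya" would require keeping the final "a", which conflicts with the rule that turns विजय into "Vijay", so a small exception list would be needed to get both.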
