Coding_Test_problem_description_RA_final

The AI-GA dataset contains 14,331 abstracts, split between 7,248 AI-generated and 7,083 original samples, formatted in CSV with columns for doc_id, abstract, title, and label. The document outlines three coding tasks: identifying the top 10 most frequent words in different categories of abstracts, finding 10 abstracts similar to a specific one, and building a deep learning model to classify abstracts. Submissions should include Python code, outputs, requirement files, and Readme files detailing methods and implementations.

Uploaded by

Gowtham Saravanan

Description of Data

The AI-GA (Artificial Intelligence Generated Abstracts) dataset is a collection of abstracts, each either
AI-generated or original. The AI-generated abstracts were produced with state-of-the-art language
generation techniques, specifically the GPT-3 model.

The dataset is provided in CSV format, with each row representing a single sample.

Total sample size: 14,331 (7,248 AI-generated and 7,083 original)

Each sample contains four columns: doc_id, abstract, title, and label. The label indicates whether the
sample is an original abstract (labeled as 0) or an AI-generated abstract (labeled as 1).
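As a minimal sketch of reading this layout, the snippet below parses the four columns and counts samples per label. It assumes the CSV has a header row with exactly these column names; the helper name `load_samples` and the inline rows are hypothetical stand-ins, not part of the dataset.

```python
import csv
import io
from collections import Counter


def load_samples(handle):
    """Read rows from an open CSV file and cast the label column to int."""
    rows = []
    for row in csv.DictReader(handle):
        row["label"] = int(row["label"])  # 0 = original, 1 = AI-generated
        rows.append(row)
    return rows


# Tiny inline stand-in for the real file (made-up rows, not from the dataset).
SAMPLE_CSV = """doc_id,abstract,title,label
0,"We study graphs, trees and paths.",Graph paper,0
1,"This paper explores graph methods.",Graph study,1
2,"Protein folding is measured in vitro.",Folding,0
"""

samples = load_samples(io.StringIO(SAMPLE_CSV))
label_counts = Counter(sample["label"] for sample in samples)
print(label_counts)
```

With the real file, the same function can be given a handle from `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved. The counts should then match the sample sizes stated above.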

Coding tasks

Task 1. Find the top 10 most frequent words in the AI-generated abstracts, the original abstracts, and all abstracts combined.
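One possible approach to Task 1 is sketched below, using only the standard library. The tokenization rule (lowercase alphabetic tokens) and the decision not to remove stop words are assumptions, since the task does not specify either; the function name `top_words` and the sample texts are hypothetical.

```python
import re
from collections import Counter

# Lowercase words, optionally with an internal apostrophe (e.g. "don't").
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def top_words(abstracts, k=10):
    """Return the k most common lowercase word tokens across the abstracts."""
    counts = Counter()
    for text in abstracts:
        counts.update(TOKEN_RE.findall(text.lower()))
    return counts.most_common(k)


# Made-up (abstract, label) pairs; label 1 = AI-generated, 0 = original.
data = [
    ("The model learns the task.", 1),
    ("The results show the model works.", 1),
    ("We measure results in the lab.", 0),
]

ai = [text for text, label in data if label == 1]
original = [text for text, label in data if label == 0]
everything = [text for text, _ in data]

for name, group in [("AI-generated", ai), ("original", original), ("all", everything)]:
    print(name, top_words(group, k=3))
```

Without stop-word filtering, words such as "the" are likely to dominate all three lists, so a submission may want to state which choice it made and why.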

Task 2. Using a natural language processing (NLP) method, identify 10 abstracts from the whole dataset
that are most similar to the 5th abstract (doc_id = 4) in their content.
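A minimal sketch of one candidate method for Task 2 is TF-IDF weighting with cosine similarity, written here with the standard library only. The choice of TF-IDF is an assumption (the task leaves the NLP method open), and the function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF so terms present in every document keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs, query_index, k=10):
    """Return (index, score) for the k documents closest to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scores = [(i, cosine(query, v)) for i, v in enumerate(vectors) if i != query_index]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up abstracts; in the real data the query would be the row with doc_id = 4.
docs = [
    "Graph neural networks for molecule property prediction.",
    "Protein folding measured with spectroscopy.",
    "Neural networks predict molecule properties from graph structure.",
    "A survey of medieval trade routes.",
]
print(most_similar(docs, query_index=0, k=2))
```

On the real data it is safer to locate the query row by its doc_id value than by list position, in case the rows are not stored in doc_id order. TF-IDF only captures word overlap, so an embedding-based method may rank paraphrased abstracts differently.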

Task 3. Build a neural network/deep learning model to predict whether an abstract is AI-generated or
original. You are encouraged to use common deep learning frameworks such as PyTorch or TensorFlow.
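The sketch below shows one possible shape for Task 3 in PyTorch: a small feed-forward network over bag-of-words counts, trained with binary cross-entropy. The architecture, hyperparameters, and sample texts are assumptions chosen for illustration, not a recommended final model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Made-up training texts; label 1 = AI-generated, 0 = original.
texts = [
    "this paper explores a novel approach",
    "this study explores a novel framework",
    "we measured enzyme rates at ph seven",
    "we measured soil samples in spring",
]
labels = torch.tensor([1.0, 1.0, 0.0, 0.0])

# Bag-of-words features over the vocabulary seen in training.
vocab = {word: i for i, word in enumerate(sorted({w for t in texts for w in t.split()}))}


def featurize(text):
    """Count vocabulary words in the text; unknown words are ignored."""
    vec = torch.zeros(len(vocab))
    for word in text.split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


features = torch.stack([featurize(t) for t in texts])

model = nn.Sequential(
    nn.Linear(len(vocab), 16),
    nn.ReLU(),
    nn.Linear(16, 1),  # one logit; the sigmoid is applied inside the loss
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

losses = []
for _ in range(100):
    optimizer.zero_grad()
    logits = model(features).squeeze(1)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

with torch.no_grad():
    predictions = (torch.sigmoid(model(features).squeeze(1)) > 0.5).float()
print(losses[0], losses[-1], predictions.tolist())
```

This toy run only checks that the model can fit four training examples. On the full dataset, performance should be reported on a held-out test split that the model never sees during training.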

What to Submit

Please submit all the required files (with proper file names) in a single .zip file through email.

Task 1.

a. Python code

b. The top 10 words in each of those three categories (AI-generated, original, AI-generated+original)

Task 2.

a. Python code

b. Output from your code: the 10 abstracts most similar to the 5th abstract

c. A requirement.txt file specifying all dependent libraries and the versions used

d. A Readme file that describes

- The key idea and reasoning behind your method for measuring the content similarity between
abstracts and why you chose this method

- What has been implemented and what might have been left out due to the time limit

Task 3.

a. Python code

b. Output from your code that shows the performance of your classifier

c. A requirement.txt file specifying all dependent libraries and the versions used

d. A Readme file that describes

- The key idea and reasoning behind your algorithm/model and why you chose this type of
algorithm/model

- What has been implemented and what might have been left out due to the time limit

- Key performance metrics used in training and testing
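The brief does not name specific metrics. As a hedged sketch, the snippet below computes four common ones for a binary classifier (accuracy, precision, recall, F1), treating label 1 (AI-generated) as the positive class. The function name `binary_metrics` and the example labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 with label 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels and predictions for eight test samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because the two classes are close to balanced (7,248 vs 7,083), accuracy is a reasonable headline number here, with precision and recall showing which kind of error the classifier makes more often.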
