Samarth Aggarwal

Palo Alto, California, United States

2K followers 500+ connections

Glean

University of Illinois Urbana-Champaign

Activity

When the whole industry start copying you - You have done a good work. We have been serving a lot of creators in JEE industry. Our USP lies in the…

Liked by Samarth Aggarwal
By next year, AI gains should be seen on balance sheets, otherwise, we have a problem. Thank you, Deirdre Bosa and CNBC Tech Check, for having…

Liked by Samarth Aggarwal
As I’ve been reflecting on Glean’s historic last week: $260+ million Series E round at a $4.6 billion valuation, new product capabilities to enable…

Liked by Samarth Aggarwal

Experience

Glean

Palo Alto, California, United States

Palo Alto, California, United States

Menlo Park, California, United States

Gurugram, Haryana, India

Bengaluru Area, India

Gurgaon, India

Education

University of Illinois Urbana-Champaign

Teaching Assistantships:
CS425: Distributed Systems, Prof. Indranil (Indy) Gupta
CS441: Applied Machine Learning, Prof. Marco Morales Aguirre
Student mentor to 7 freshmen.
Teaching Assistant for Intro to Computer Science.

Volunteer Experience

Mentor

Sewa Bharti - India

Nov 2017 - Jan 2018 3 months

Education

Tutored underprivileged students on the basics of computers and software.

Student Mentor

National Service Scheme, IITD

Jul 2016 - Aug 2017 1 year 2 months

Environment

Dedicated more than 100 hours to NSS IITD across these projects:
1) Arohan - Taught and mentored JEE aspirants.
2) Apna Parivaar - Raised funds for the orphanage 'Apna Parivaar', planted saplings in its garden, and organized cultural and educational activities for the children.
3) Toys from Trash - Built toys from trash that each demonstrate a scientific principle, and showed students at different schools how to make them and the principle behind them.

Publications

Goal-Driven Command Recommendation for Analysts

Proceedings of the 2020 ACM Conference on Recommender Systems (RecSys) July 23, 2020

Recent times have seen data analytics software applications become an integral part of the decision-making process of analysts. The users of these software applications generate a vast amount of unstructured log data. These logs contain clues to the user intentions, which traditional recommender systems may find difficult to model implicitly from the log data. With this assumption, we would like to assist the analytics process of a user through command recommendations. We categorize the commands into software and data categories based on their purpose to fulfill the task at hand. On the premise that the sequence of commands leading up to a data command is a good predictor of the latter, we design, develop, and validate various sequence modeling techniques. In this paper, we propose a framework to provide intent-driven data command recommendations to the user by leveraging unstructured logs. We use the log data of a web-based analytics software to train our neural network models and quantify their performance, in comparison to relevant and competitive baselines. We propose a custom loss function to tailor the recommended data commands according to the intent provided exogenously. We also propose an evaluation metric that captures the degree of intent orientation of the recommendations. We demonstrate the promise of our approach by evaluating the models with the proposed metric and showcasing the robustness of our models in the case of adversarial examples, where the selected intent is misaligned with user activity, through offline evaluation.

IMoJIE : Iterative Memory based Joint Open Information Extraction

Association of Computational Linguistics (ACL) 2020 March 20, 2020

While traditional systems for Open Information Extraction were statistical and rule-based, recently neural models have been introduced for the task. Our work builds upon CopyAttention, a sequence generation OpenIE model (Cui et al., 2018). Our analysis reveals that CopyAttention produces a constant number of extractions per sentence, and its extracted tuples often express redundant information.
We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples. This approach overcomes both shortcomings of CopyAttention, resulting in a variable number of diverse extractions per sentence. We train IMoJIE on training data bootstrapped from extractions of several non-neural systems, which have been automatically filtered to reduce redundancy and noise. IMoJIE outperforms CopyAttention by about 18 F1 pts, and a BERT-based strong baseline by 2 F1 pts, establishing a new state of the art for the task.

CaRB: A Crowdsourced Benchmark for OpenIE

Empirical Methods in Natural Language Processing (EMNLP) 2019 August 14, 2019

Open Information Extraction (Open IE) systems have been traditionally evaluated via manual annotation. Recently, an automated evaluator with a benchmark dataset (OIE2016) was released – it scores Open IE systems automatically by matching system predictions with predictions in the benchmark dataset. Unfortunately, our analysis reveals that its data is rather noisy, and the tuple matching in the evaluator has issues, making the results of automated comparisons less trustworthy.

We contribute CaRB, an improved dataset and framework for testing Open IE systems. To the best of our knowledge, CaRB is the first crowdsourced Open IE dataset and it also makes substantive changes in the matching code and metrics. NLP experts annotate CaRB’s dataset to be more accurate than OIE2016. Moreover, we find that on one pair of Open IE systems, CaRB framework provides contradictory results to OIE2016. Human assessment verifies that CaRB’s ranking of the two systems is the accurate ranking. We release the CaRB framework along with its crowdsourced dataset.

OpenIE6: Iterative Grid Labeling and Coordination Analysis for Open Information Extraction

Empirical Methods in Natural Language Processing (EMNLP) 2020

A recent state-of-the-art neural open information extraction (OpenIE) system generates extractions iteratively, requiring repeated encoding of partial outputs. This comes at a significant computational cost. On the other hand, sequence labeling approaches for OpenIE are much faster, but worse in extraction quality. In this paper, we bridge this trade-off by presenting an iterative labeling-based system that establishes a new state of the art for OpenIE, while extracting 10x faster. This is achieved through a novel Iterative Grid Labeling (IGL) architecture, which treats OpenIE as a 2-D grid labeling task. We improve its performance further by applying coverage (soft) constraints on the grid at training time.
Moreover, on observing that the best OpenIE systems falter at handling coordination structures, our OpenIE system also incorporates a new coordination analyzer built with the same IGL architecture. This IGL based coordination analyzer helps our OpenIE system handle complicated coordination structures, while also establishing a new state of the art on the task of coordination analysis, with a 12.3 pts improvement in F1 over previous analyzers. Our OpenIE system, OpenIE6, beats the previous systems by as much as 4 pts in F1, while being much faster.

Patents

Intent-Based Command Recommendation Generation in an Analytics System

US16/928,888

Languages

English

Native or bilingual proficiency

Hindi

Native or bilingual proficiency

French

Elementary proficiency

More activity by Samarth

Everyone at Glean is incredibly proud right now. We've all worked hard to deliver the 100x user growth, 10x revenue…

Liked by Samarth Aggarwal
We've got news 🎉 We've raised over $260M at a $4.6B valuation co-led by Altimeter and DST Global. And that's not all! We're thrilled to introduce…

Liked by Samarth Aggarwal
