Michel Tourn
New York, New York, United States
219 followers
210 connections
About
Google, Twitter, Kosmix, Yahoo, Inktomi, Verity
Excited by both algorithmic…
Activity
- Google launches new search engine to help scientists find the datasets they need
  Liked by Michel Tourn
- Building the next generation of data engineering with Delta Lake #sparkaisummit
  Liked by Michel Tourn
Experience
Other similar profiles
- Congxing Cai (San Francisco Bay Area)
- Jiabin zhang (San Francisco Bay Area)
- Feng Zhuge (Palo Alto, CA)
- Xian Zhang (Los Angeles, CA)
- Jia Li (Sunnyvale, CA)
- Zhengyi Liu (San Francisco Bay Area)
- Rakesh Aggarwal (San Francisco Bay Area)
- Yemao Zeng (Mountain View, CA)
- Tao Guo (Seattle, WA)
- Yang Gu (United States)
- Sachin Patel (Greater Houston)
- Ke Yang (Cupertino, CA)
- Guangyu Chen (Los Angeles Metropolitan Area)
- Abhishek Suman (Redmond, WA)
- Ruochen Xu, Software Engineer @ Tiktok CS @UIUC'15 (Mountain View, CA)
- Udit Valecha (San Francisco Bay Area)
- Alexander Walczak, Google Engineer 👨💻 | NYC EMT 🚑 (New York, NY)
- Navid Vafaei (Greater Seattle Area)
- Chao Chen (Mountain View, CA)
- Romal Thoppilan (United States)
Explore more posts
-
Hendrix Liu
I just published a blog on how to fine-tune LLMs and create custom datasets. In this blog, I cover:
- What fine-tuning is and its benefits
- When to fine-tune a model
- Methods for fine-tuning, including full fine-tuning and parameter-efficient fine-tuning
- How to prepare your custom dataset
Check it out here: https://lnkd.in/gDcrrBGP
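The linked post covers parameter-efficient fine-tuning at a high level; as a rough sketch of what that looks like in practice with the Hugging Face peft library (the base model name and LoRA hyperparameters below are placeholder assumptions, not taken from the blog):

```python
# Minimal LoRA fine-tuning setup (illustrative only; the model name, target
# modules, and hyperparameters are assumptions, not from the linked blog).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA injects small trainable low-rank matrices into selected projection
# layers, so only a tiny fraction of parameters is updated during fine-tuning.
lora_cfg = LoraConfig(
    r=8,                                   # rank of the low-rank update
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # depends on the model architecture
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()         # typically well under 1% of the total
```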
-
Scale AI
LLMs have become more capable with better training and data. But they haven’t figured out how to “think” through problems at test-time. The latest research from Scale finds that simply scaling inference compute, meaning giving models more time or attempts to solve a problem, is not effective because the attempts are not diverse enough from each other.
👉 Enter PlanSearch, a novel method for code generation that searches over high-level "plans" in natural language to encourage response diversity. PlanSearch enables the model to “think” through various strategies before generating code, making it more likely to solve the problem correctly. The Scale team tested PlanSearch on major coding benchmarks (HumanEval+, MBPP+, and LiveCodeBench) and found it consistently outperforms baselines, particularly in extended search scenarios. Overall performance improves by over 16% on LiveCodeBench, from 60.6% to 77%.
Here’s how it works:
✅ PlanSearch first generates high-level strategies, or "plans," in natural language before proceeding to code generation.
✅ These plans are then further broken down into structured observations and solution sketches, allowing for a wider exploration of possible solutions. This increases diversity, reducing the chance of the model recycling similar ideas.
✅ These plans are then combined before settling on the final idea and implementing the solution in code.
Enabling LLMs to reason more deeply at inference time via search is one of the most exciting directions in AI right now. When PlanSearch is paired with filtering techniques, such as submitting only solutions that pass initial tests, we can get better results overall and achieve the top score of 77% with only 10 submission attempts.
Big thanks to all collaborators on this paper including: Evan Wang, Hugh Zhang, Federico Cassano, Catherine Wu, Yunfeng Bai, William Song, Vaskar Nath, Ziwen H., Sean Hendryx, Summer Yue
👉 Read the full paper here: arxiv.org/abs/2409.03733
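As a loose illustration of the search loop described above (the llm() helper, prompts, and parameters are hypothetical placeholders, not the paper's implementation):

```python
# Illustrative sketch of the PlanSearch idea: draft natural-language plans
# first, combine them for diversity, and only then generate code.
# `llm` is a hypothetical completion function, not a real API.
from itertools import combinations

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

def plan_search(problem: str, n_observations: int = 4) -> list[str]:
    # 1) Draft distinct high-level observations about the problem.
    observations = [
        llm(f"Problem:\n{problem}\nGive one distinct high-level observation.")
        for _ in range(n_observations)
    ]
    # 2) Combine pairs of observations into solution sketches to widen the
    #    search and avoid recycling similar ideas.
    sketches = [
        llm(f"Problem:\n{problem}\nObservations:\n- " + "\n- ".join(pair) +
            "\nWrite a short natural-language solution sketch.")
        for pair in combinations(observations, 2)
    ]
    # 3) Implement each sketch as code; downstream filtering (e.g. public
    #    tests) would then decide which candidates to submit.
    return [llm(f"Implement this plan as code:\n{s}") for s in sketches]
```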
-
Kartik Talamadupula
We just released a blog on whether the use of multimodal feature transfer can detect deception (phenomena like sarcasm, irony, and condescension) in conversation. The short answer... Yes. This particular write-up profiles a subset of our work at Symbl.ai on multimodal large language models, and is research led by one of our summer interns, Benjamin R. (from Prof. Larry Heck's AVA Lab at Georgia Institute of Technology). We have a paper currently under submission on this work, and we can't wait to share the full details with all of you soon. Blog Link: https://lnkd.in/g6m_wFDi #multimodal #featuretransfer #LLMs #deception #conversation #sarcasm #irony #language #AI #ML #NLP #LargeLanguageModels #SummerInternship #ResearchPapers #ACL #Llama2 #SymblDotAI
-
Tejaswi Tenneti
⚡ New blog post about our work on hybrid retrieval for #Instacart search
While it's widely recognized that adopting a hybrid retrieval system (keyword + vector) can enhance performance, creating a bespoke solution tailored to our specific business needs proved to be quite a challenge! In our latest blog post, we explore:
1. The current search retrieval architecture at Instacart.
2. The challenges we faced, including the variance in search query distribution and the number of relevant documents per query.
3. Our methods for contextually adapting the recall set from each retrieval source to boost the precision and recall of the final results.
Super proud of the team who made this possible: Vinesh Gudla Prakash Reddy Putta Ankit mittal Andrew Tanner Guanhua S. Xiao Xiao Taesik Na Alex Charlton Xukai Tang Akshay Nair Haixun Wang
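The blog has the real architecture; the sketch below only illustrates the generic shape of a keyword + vector merge with per-source score normalization (the normalization and weighting scheme are assumptions for illustration, not Instacart's):

```python
# Illustrative hybrid retrieval merge: combine keyword (BM25-style) and vector
# results, normalize scores per source, and take a weighted blend per query.
def normalize(scores: dict[str, float]) -> dict[str, float]:
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_merge(keyword_hits: dict[str, float],
                 vector_hits: dict[str, float],
                 keyword_weight: float = 0.5,
                 top_k: int = 50) -> list[str]:
    kw, vec = normalize(keyword_hits), normalize(vector_hits)
    merged: dict[str, float] = {}
    for doc in set(kw) | set(vec):
        merged[doc] = (keyword_weight * kw.get(doc, 0.0)
                       + (1 - keyword_weight) * vec.get(doc, 0.0))
    return sorted(merged, key=merged.get, reverse=True)[:top_k]
```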
-
Tony Lee
I'm excited to share two papers from the Retrieval team at Walmart that were presented at CIKM last week. They're both about improving the relevance of Embedding-Based Retrieval.
𝗥𝗲𝗹𝗲𝘃𝗮𝗻𝗰𝗲 𝗙𝗶𝗹𝘁𝗲𝗿𝗶𝗻𝗴 𝗳𝗼𝗿 𝗘𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴-𝗯𝗮𝘀𝗲𝗱 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹: https://lnkd.in/gNQe7QhA
The first is about how to intelligently choose a query-dependent cutoff on the cosine score to remove irrelevant items from the retrieved set. Since cosine scores aren't comparable across queries, we introduce a "Cosine Adapter" to map the raw scores to interpretable relevance scores.
𝗘𝗻𝗵𝗮𝗻𝗰𝗶𝗻𝗴 𝗥𝗲𝗹𝗲𝘃𝗮𝗻𝗰𝗲 𝗼𝗳 𝗘𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴-𝗯𝗮𝘀𝗲𝗱 𝗥𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹 𝗮𝘁 𝗪𝗮𝗹𝗺𝗮𝗿𝘁: https://lnkd.in/gdqkc2hR
The second is about model training techniques we learned to improve the relevance of embedding models. In particular, we describe how to use a teacher model to provide relevance labels for training. Juexin (Joy) Lin Nicholas Rossi, PhD Feng LIU Zhen Yang Sachin Yadav Praveen Reddy Suram Satya Chembolu Prijith Chandran Hrushikesh Mohapatra Alessandro Magnani Ciya Liao
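The paper's Cosine Adapter is a learned mapping; the toy sketch below only shows where a query-dependent calibration and cutoff would sit in the pipeline, with a placeholder logistic transform standing in for the learned model:

```python
# Rough sketch: map raw cosine scores to calibrated relevance scores with a
# query-dependent transform, then filter retrieved items below a threshold.
# The logistic mapping here is a placeholder, not the paper's learned adapter.
import math

def calibrate(cosine: float, query_scale: float, query_shift: float) -> float:
    # Query-dependent scale/shift makes a single global cutoff meaningful
    # even though raw cosine scores are not comparable across queries.
    return 1.0 / (1.0 + math.exp(-query_scale * (cosine - query_shift)))

def filter_retrieved(items: list[tuple[str, float]],
                     query_scale: float,
                     query_shift: float,
                     threshold: float = 0.5) -> list[str]:
    return [doc for doc, cos in items
            if calibrate(cos, query_scale, query_shift) >= threshold]
```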
-
Intel Corporation
Unlock the full potential of your Large Language Models (LLMs) with Intel® Extension for PyTorch (IPEX) and the Intel® LLM Library for PyTorch (IPEX-LLM). Download this whitepaper to explore the optimization of LLMs, including their performance, resource utilization, and response times in real-world applications. Link - https://intel.ly/3BBJ4ey
-
Intel Corporation
Unlock the full potential of your Large Language Models (LLMs) with Intel® Extension for PyTorch (IPEX) and the Intel® LLM Library for PyTorch (IPEX-LLM). Download this whitepaper to explore the optimization of LLMs, including their performance, resource utilization, and response times in real-world applications. Link - https://intel.ly/4dwlqgQ
-
Intel Corporation
Unlock the full potential of your Large Language Models (LLMs) with Intel® Extension for PyTorch (IPEX) and the Intel® LLM Library for PyTorch (IPEX-LLM). Download this whitepaper to explore the optimization of LLMs, including their performance, resource utilization, and response times in real-world applications. Link - https://intel.ly/3BBTlHG
-
Madan Ram
Check out our latest post on Clappy.ai, "Leverage Layout-Aware RAG for More Accurate Retrieval of Knowledge from Documents." Discover how to harness the power of Retrieval-Augmented Generation (RAG) for documents in PDF, Doc, and Markdown formats. Dive into the specifics of implementation and explore our reference repository to get started with advanced RAG for your knowledge base. Don't miss out on this essential read! Continue reading here for all the details: The Power of Hierarchical Indexing with RAG. (https://lnkd.in/gbiVHfcH)
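As a rough sketch of the hierarchical-indexing idea (the types and similarity() helper below are hypothetical stand-ins, not the linked repository's code):

```python
# Illustrative hierarchical indexing for RAG: rank layout-aware sections by
# their summaries first, then rank chunks only within the best sections.
from dataclasses import dataclass

@dataclass
class Section:
    title: str
    summary: str
    chunks: list[str]

def similarity(a: str, b: str) -> float:
    # Placeholder lexical overlap; a real system would use embeddings.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / (len(ta | tb) or 1)

def retrieve(query: str, sections: list[Section],
             top_sections: int = 2, top_chunks: int = 3) -> list[str]:
    # Coarse step: pick the most relevant sections (layout-aware units).
    ranked = sorted(sections, key=lambda s: similarity(query, s.summary), reverse=True)
    # Fine step: pick the best chunks from those sections only.
    candidates = [c for s in ranked[:top_sections] for c in s.chunks]
    return sorted(candidates, key=lambda c: similarity(query, c), reverse=True)[:top_chunks]
```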
-
Taige Eoff
Matrix multiplication (MatMul) typically dominates the overall computational cost of large language models (and in general AB ≠ BA, fun huh?). As you can imagine, this cost only grows as LLMs scale to larger embedding dimensions and context lengths. In this work, they show that MatMul operations can be completely eliminated from LLMs while maintaining strong performance at billion-parameter scales. Removing MatMul from the calculations in large language models means these models don't need powerful computers to run. This change allows them to work on simpler devices, like smaller servers or even some personal computers, making advanced AI tools available to more people and places. Hardware Optimization: Custom accelerators demonstrate the practicality of this method by processing billion-parameter models with just 13 watts of power. #whynividia
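The core trick in that line of work is replacing dense weights with ternary values so multiply-accumulates become adds and subtracts; a toy illustration of why that removes multiplications (not the paper's optimized kernels):

```python
# Toy illustration: with ternary {-1, 0, +1} weights, each output element is
# just a sum/difference of selected inputs, so no multiplications are needed.
def ternary_matvec(weights: list[list[int]], x: list[float]) -> list[float]:
    out = []
    for row in weights:                 # one output unit per weight row
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi               # +1 weight: add the input
            elif w == -1:
                acc -= xi               # -1 weight: subtract the input
            # w == 0 contributes nothing and is skipped entirely
        out.append(acc)
    return out

print(ternary_matvec([[1, -1, 0], [0, 1, 1]], [0.5, 2.0, -1.0]))  # [-1.5, 1.0]
```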
-
Rabia N.
I wanted to share some insights from a recent paper from Amazon Science in which the authors tackled an important challenge in language modeling: minimizing sequence truncation. In language models, truncating input sequences might lead to a loss of crucial information, especially when the models are dealing with long-context tasks such as document summarization or story generation. The authors introduce a clever solution by leveraging a heuristic from the Best Fit Bin Packing problem, an NP-hard problem, to reorder input documents and reduce truncation. The proposed approach helps retain more information compared to other strategies by optimizing how sequences are packed into the model's context window. The authors show that the bin-packing solution produced its biggest performance gains in reading comprehension (+4.7%), natural-language inference (+9.3%), context following (+16.8%), and program synthesis. They also found that best-fit packing helped prevent closed-domain hallucination. The authors claim that while the experiments primarily focused on LLM pretraining, best-fit packing is broadly applicable to fine-tuning as well. https://lnkd.in/gWquw3ud
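A generic best-fit-decreasing packing heuristic conveys the idea; this is an illustrative sketch, not the paper's exact algorithm, and it assumes each document already fits within the context length:

```python
# Best-fit-decreasing packing sketch: place each document into the context
# "bin" whose remaining space fits it most tightly, opening a new bin only
# when nothing fits, so fewer documents need to be truncated.
def best_fit_pack(doc_lengths: list[int], context_len: int) -> list[list[int]]:
    bins: list[list[int]] = []   # each bin holds document lengths
    remaining: list[int] = []    # free tokens left in each bin
    for length in sorted(doc_lengths, reverse=True):
        # Pick the bin with the least leftover space that still fits the doc.
        best = None
        for i, free in enumerate(remaining):
            if length <= free and (best is None or free < remaining[best]):
                best = i
        if best is None:
            bins.append([length])
            remaining.append(context_len - length)
        else:
            bins[best].append(length)
            remaining[best] -= length
    return bins

print(best_fit_pack([900, 700, 600, 300, 200], context_len=1024))
# [[900], [700, 300], [600, 200]]
```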
-
Zander Matheson
The latest article I have written is the second part of a series focused on breaking down streaming data for data scientists. As a data scientist for many years, I have encountered batch processes that made sense but were much harder to understand in a streaming context. Windowing is so important for stream processing as it allows us to break down continuous streams of data into discrete chunks that we can more easily reason about. Check out the second part of the series! I would love your feedback on how I can make the series better.
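For readers new to windowing, a bare-bones tumbling-window aggregation in plain Python shows the idea (a real streaming framework additionally handles late data, state, and recovery):

```python
# Bare-bones tumbling window: group a stream of (timestamp, value) events into
# fixed, non-overlapping time buckets and aggregate each bucket.
from collections import defaultdict

def tumbling_window(events, window_seconds: int):
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in events:
        buckets[int(ts // window_seconds)].append(value)
    # Emit (window_start, count, mean) per window.
    return [
        (w * window_seconds, len(vals), sum(vals) / len(vals))
        for w, vals in sorted(buckets.items())
    ]

events = [(1, 10.0), (4, 12.0), (11, 8.0), (13, 9.0), (22, 7.0)]
print(tumbling_window(events, window_seconds=10))
# [(0, 2, 11.0), (10, 2, 8.5), (20, 1, 7.0)]
```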
-
ArunKumar R
LLMs are great at solving problems with data, but sometimes they need a little extra help to take action. That's where LLM agents come in. These powerful systems use the LLM as a reasoning engine to create a plan and execute it with a set of tools. An agent is made up of four components: the agent core, memory, tools, and planning module. The agent core is the decision-making module, while memory stores internal logs and user interactions. Tools are executable workflows, such as RAG pipelines or code interpreters, and planning modules break down tasks using LLMs. With these capabilities, agents can execute complex tasks like searching the internet or generating context-aware answers. So next time you need to take action with your data, consider using an LLM agent to get the job done efficiently and effectively. Langchain has great support for agents, and the possibilities for using agents will only grow as the ecosystem matures. #LLM #LLMagents #GenAI
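A toy version of that loop might look like the following; the llm() call and the tool set are hypothetical placeholders rather than any particular framework's API:

```python
# Toy agent loop illustrating the four pieces above: an LLM "core" that
# decides, a memory of past steps, a tool registry, and a plan/act loop.
def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model call here")

TOOLS = {
    "search": lambda q: f"(search results for: {q})",   # stand-in for web search
    "calculator": lambda expr: str(eval(expr)),         # toy tool, unsafe on real input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    memory: list[str] = []                               # internal log of steps
    for _ in range(max_steps):
        decision = llm(
            f"Task: {task}\nHistory:\n" + "\n".join(memory) +
            "\nReply either 'TOOL <name> <input>' or 'FINAL <answer>'."
        )
        if decision.startswith("FINAL"):
            return decision.removeprefix("FINAL").strip()
        _, name, tool_input = decision.split(" ", 2)
        memory.append(f"{name}({tool_input}) -> {TOOLS[name](tool_input)}")
    return "stopped after max_steps"
```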
-
Jay Swartz
Can LLMs Avoid Costly Matrix Multiplication Overhead? I recently read an interesting paper that describes how to eliminate matrix multiplication (MatMul) from LLM training and inference processes. Scalable MatMul-free Language Modeling (https://lnkd.in/gpfxnKr3) compares standard self attention and MatMul-free alternatives. Reducing or eliminating the overhead of matrix multiplication will have a major impact on the cost of building and using AI. This also shows there is substantial room for improvement over pure GenAI solutions. I believe that the components for actual AGI are starting to be revealed. It could be some time before actual AGI emerges, but incredibly useful AI has already been created and is being deployed in controlled situations. The foundation model efforts to find AGI have provided powerful tools for coding and image/video production that can be immediately applied to an ever wider range of tasks. The discovery that these models can be made more cost effective will increase the number of implementations that will be possible.
-
Viral Gupta
The quest for search relevance is tougher than it looks! With limited context in retail queries (think “bat”—baseball or animal?), relevance can be ambiguous. This is where fine-tuned LLMs like Llama3–8b make a huge impact, learning to capture nuances, interpret complex intent, and improve search precision at scale. Now, context-aware AI delivers faster, smarter results across millions of predictions daily, transforming search into an intuitive discovery tool for users everywhere. Exciting times ahead for LLM-powered search! 🌟 #AI #MachineLearning #SearchRelevance #Innovation #RetailTech More details in the blog https://lnkd.in/gvRS2Un4
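As a loose illustration of using a fine-tuned LLM as a relevance judge (the prompt format, label set, and chat() helper are assumptions for illustration, not the blog's implementation):

```python
# Loose sketch: ask a (fine-tuned) LLM to classify query-item relevance.
LABELS = ["exact", "substitute", "irrelevant"]

def chat(prompt: str) -> str:
    raise NotImplementedError("plug in your fine-tuned model here")

def judge_relevance(query: str, item_title: str) -> str:
    prompt = (
        "You grade search relevance for a retail catalog.\n"
        f"Query: {query}\n"
        f"Item: {item_title}\n"
        f"Answer with exactly one of: {', '.join(LABELS)}."
    )
    answer = chat(prompt).strip().lower()
    return answer if answer in LABELS else "irrelevant"  # fall back conservatively
```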
-
Ravi Shankar
FlexAttention: Simplifying Attention in PyTorch
FlexAttention is a new PyTorch API that combines the flexibility of PyTorch with the performance of optimized attention methods. It allows researchers to implement various attention mechanisms easily without needing to write custom kernels.
Key Points:
- Flexibility with Performance: Traditional attention methods offer high performance but limited flexibility. FlexAttention provides a flexible API that supports many attention variants with minimal effort.
- How It Works: Standard attention calculates attention scores using queries, keys, and values, normalizes them, and computes the weighted sum of values. FlexAttention allows modifications to the attention scores through a flexible scoring function, enabling customization for different types of attention mechanisms.
- Examples of Attention Variants: Relative Position Encoding adds positional information to attention scores based on relative positions. ALiBi Bias applies a bias to attention scores to account for position differences, enhancing model performance. Causal Mask masks future positions in the sequence so the model only attends to previous positions, useful for autoregressive models.
- Handling Sparsity: BlockMask efficiently manages sparse attention masks, such as those used in causal attention, by creating masks that prevent unnecessary computations.
Learn More: https://lnkd.in/g58pxB4A
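For a feel of the API, here is the general shape of a score_mod that applies an ALiBi-style distance penalty, assuming the torch.nn.attention.flex_attention module available in recent PyTorch (2.5+) builds:

```python
# Sketch of a FlexAttention score_mod for an ALiBi-style bias, assuming the
# torch.nn.attention.flex_attention API in recent PyTorch releases.
import torch
from torch.nn.attention.flex_attention import flex_attention

B, H, S, D = 2, 8, 128, 64
q = torch.randn(B, H, S, D)
k = torch.randn(B, H, S, D)
v = torch.randn(B, H, S, D)

# Per-head slopes for the distance penalty (geometric sequence).
slopes = torch.exp2(-torch.arange(1, H + 1, dtype=torch.float32))

def alibi_score_mod(score, b, h, q_idx, kv_idx):
    # Subtract a per-head penalty that grows with the query/key distance.
    return score - slopes[h] * (q_idx - kv_idx).abs()

out = flex_attention(q, k, v, score_mod=alibi_score_mod)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```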
-
Alon Faktor, PhD.
Hey, I'd like to give a shout-out to ClearML for helping us at Vimeo develop AI-based systems. We have been happy customers for a few years now and I'd like to share a bit about how we use ClearML. We use ClearML Datasets to cache a dataset of video transcripts, and run testing, loading directly from ClearML Datasets. This allows us to simultaneously speed up the data loading and ensure consistency. Moreover, we use ClearML to save our benchmark annotations in one place and track our system performance on the benchmark every time we run an experiment or change our prompts. ClearML allows us to see the different parameters and prompts that were used for each experiment and to monitor improvements or regressions in our performance. We also use ClearML to run large-scale tests and help with statistical evaluation of our methods. For example, we developed a RAG (Retrieval Augmented Generation) Q&A system and wanted to verify that the LLM will not answer certain questions or user queries that are outside the scope of the video. We used ClearML to collect and analyze the RAG responses on many videos for predefined user queries that were outside the scope of the videos and got good visibility into the performance of the system. Also, the comparison feature on ClearML is great for tracking the improvement of our metrics along the progressing versions of our systems.
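For readers unfamiliar with the pattern, pulling a cached ClearML dataset locally looks roughly like this (the project and dataset names are placeholders, not Vimeo's actual setup):

```python
# Rough sketch of the ClearML Datasets usage pattern described above: fetch a
# versioned dataset once and read from the local cache afterwards.
from clearml import Dataset

dataset = Dataset.get(
    dataset_project="video-transcripts",   # placeholder project name
    dataset_name="transcripts-benchmark",  # placeholder dataset name
)
local_path = dataset.get_local_copy()      # cached after the first download
print("Transcripts available at:", local_path)
```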