Thinking about how to reduce your LLM costs and latency (or want to plan ahead)? We were asked about the merits of prompt engineering vs. PEFT for these purposes on a live #LLMs & AWS Q&A session we held a few weeks ago. Jared Burns discussed why you should view PEFT and prompt engineering as *complementary* to each other — not either/or. In the clip below, he talks about:
👉 Guidelines for designing good prompts
👉 Pros of PEFT
👉 Few-shot prompting
👉 How RAG can help lower your token count
👉 When to use default chunking vs. no chunking in Amazon Bedrock
Are you already using either (or both) of these tactics? Let us know what your experiences have been with PEFT and prompt engineering in the comments 👇
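Few-shot prompting, one of the techniques listed above, comes down to plain string assembly: prepend a handful of labeled examples so the model can infer the task format. A minimal sketch — the classification task, labels, and example tickets here are made up purely for illustration:

```python
# Minimal sketch of few-shot prompt assembly. The task ("classify
# support tickets") and the example data are hypothetical.
EXAMPLES = [
    ("The checkout page times out under load.", "bug"),
    ("Please add dark mode to the dashboard.", "feature-request"),
]

def build_few_shot_prompt(query: str) -> str:
    """Prepend labeled examples so the model can infer the task format."""
    lines = ["Classify each ticket as 'bug' or 'feature-request'.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Ticket: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Ticket: {query}")
    lines.append("Label:")  # the model completes from here
    return "\n".join(lines)

prompt = build_few_shot_prompt("Export to CSV fails with a 500 error.")
```

The resulting string would be sent as the prompt body to whichever model you use; note that every example you add also adds tokens, which is exactly where RAG-style retrieval of only the most relevant examples can lower the count.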
DoiT’s Post
-
Senior Lead Architect at J.P. Morgan | Ex-AWS | Cloud and Applications Architecture | Software Development and Engineering
Published Tech session 9 - Observability on Amazon ECS. https://lnkd.in/eAUpbewb
Observability on Amazon Elastic Container Service (ECS) | Tech session 9
youtube.com
-
Working through the STG312 workshop, I learned a great deal about AI/ML workloads in depth. The PyTorch connector is just another step toward getting your ML workloads done faster. #aws #s3 #pytorch
And here's a second really awesome launch today relating to the client experience with S3. The S3 Connector for PyTorch lets Torch applications directly access S3 both for data loading and for writing out checkpoints. A really cool realization that came as we built this is that it's so good at shipping data onto the network that it actually achieves higher throughput than writing to a local instance SSD, because the physical connectivity out to the network is wider. There's a huge focus on the team right now on integrating S3 directly with clients, and this has been a really awesome example to see come together. https://lnkd.in/g8_8RSTF
-
🤔 🤔 🤔 Q: Did I just find out I left a medium-sized GPU spun up on AWS, and it's cost me $200 for doing nothing over the past couple of weeks? A: Yes. Q: Did I do something similar while writing my book, except in that case it was an MWAA environment and it cost me $150? A: Also, yes. Q: Have I learned something? A: Yes? Q: Really? A: Sure. Actually set your AWS budgets and anomaly detection properly. And just make sure to terminate stuff. You know, don't be dumb? #mlops #fail #costmanagement 🙃
-
Introducing file commit history in Amazon CodeCatalyst
aws.amazon.com
-
Do you follow the tradition of using retries with exponential backoff? It's probably not such a good idea after all. Here's a great update from re:Invent 2023.
AWS re:Invent 2023 - Surviving overloads: How Amazon Prime Day avoids congestion collapse (NET402)
youtube.com
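The core point of the talk is that unbounded retries amplify load exactly when a service is already overloaded. A hedged sketch of two mitigations it covers — full-jitter backoff and a token-bucket "retry budget" that fails fast once retries exceed a fraction of traffic. All parameter values here are illustrative, not recommendations:

```python
import random

def full_jitter_delay(attempt: int, base: float = 0.1, cap: float = 5.0) -> float:
    """Exponential backoff with 'full jitter': pick a random delay in
    [0, min(cap, base * 2**attempt)] so retrying clients don't synchronize."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

class RetryBudget:
    """Token bucket for retries: successes slowly earn retry tokens,
    so retries stay a bounded fraction of overall traffic instead of
    piling onto an already-overloaded dependency."""
    def __init__(self, ratio: float = 0.1, max_tokens: float = 10.0):
        self.ratio = ratio          # tokens earned per success
        self.max_tokens = max_tokens
        self.tokens = max_tokens

    def record_success(self) -> None:
        self.tokens = min(self.max_tokens, self.tokens + self.ratio)

    def can_retry(self) -> bool:
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # budget exhausted: fail fast instead of retrying

budget = RetryBudget()
delays = [full_jitter_delay(a) for a in range(6)]
```

During an outage the budget drains quickly and `can_retry()` starts returning `False`, which is what prevents the congestion collapse the talk describes.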
-
Compression plays a crucial role in #OpenSearch, because it impacts performance and storage efficiency. Zstandard compression strikes a nice balance, making it a great addition to OpenSearch 2.9 and Amazon OpenSearch Service. Many thanks to Akash Shankaran and the team at #Intel for collaborating with us to bring this exciting feature onboard. Check out this blog to learn more.
Optimize storage costs in Amazon OpenSearch Service using Zstandard compression | Amazon Web Services
aws.amazon.com
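In OpenSearch 2.9+, the codec is chosen per index at creation time. A minimal sketch of the settings body — the index name, shard count, and compression level are illustrative, and the commented client call assumes the opensearch-py library:

```python
# Index settings selecting the Zstandard codec (OpenSearch 2.9+).
# "zstd" trades a little CPU for better compression; "zstd_no_dict"
# skips dictionary lookup for faster decompression. Values here are
# illustrative.
index_body = {
    "settings": {
        "index": {
            "codec": "zstd",
            "codec.compression_level": 3,  # higher = smaller, slower
            "number_of_shards": 1,
        }
    }
}

# Against a running cluster, this would be sent roughly as:
# client.indices.create(index="logs-2024", body=index_body)
```

Because the codec applies at segment level, existing indexes keep their old codec until reindexed; creating new indexes with `zstd` is the low-friction way to pick up the storage savings.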
-
The golden (and surprising) rule of prompt engineering: “Show your prompt to a friend and ask them if they can follow the instructions and produce the results you are looking for.” Learn this and other useful prompting techniques next week during Elina Lesyk’s and my session “Prompt engineering best practices for LLMs on Amazon Bedrock” (AIM302) at AWS Summit Berlin, May 15 + 16! https://lnkd.in/dXZEncU4
-
💡 It's clear that optimizing an ML model is key to high-performance inference, but the infrastructure used to serve that model can have an even greater impact on its performance in production. 🌐 Our co-founder Philip Howes broke down how globally distributed model serving infrastructure (both multi-cloud and multi-region) benefits availability, cost, redundancy, latency, and compliance. Check it out: https://lnkd.in/ene3pPVV
The benefits of globally distributed infrastructure for model serving
baseten.co
-
Pretty interesting read about the Amazon S3 architecture that takes you through the history of magnetic disks and more. https://lnkd.in/eXM9NZZw
Building and operating a pretty big storage system called S3
allthingsdistributed.com