LLMs On The Fly: Text-To-JSON For Custom API Calling
Abstract
In the rapidly evolving landscape of Natural Language Processing (NLP), there is a growing demand for agile and intuitive tools, driven by increasing model capabilities, primarily in the field of Large Language Models (LLMs). In recent months, we have seen great progress in the Natural Language Generation (NLG) landscape, with a proliferation of generative AI applications leveraging LLMs for a vast number of tasks. The power of LLMs resides in their ability to generalize almost any NLP task to the problem of next-token prediction, thus simplifying traditional NLP pipelines consisting of intensive data labeling and domain-specific fine-tuning for a single task. Moreover, LLMs are enhanced (1) with external knowledge bases, which improve their reasoning and domain understanding, and (2) with external tools, which improve their ability to perform actions.

We present a novel approach that harnesses the power of LLMs to transform natural language inputs into structured data representations, facilitating seamless interaction with custom APIs for real-time data visualization. We explore the integration of the Flythings® Technologies API for Internet of Things (IoT) device solutions in the Industry 4.0 domain. This system demonstration presents a chat-based virtual assistant that allows users to query the status of monitored machines and devices. The core component of the application is an LLM that serves as a bridge between user queries and machine-readable JSON objects, which adhere to a predefined schema following the Flythings standard. Our LLM output facilitates the interaction with the Flythings API, leading to the generation of visualizations that illustrate IoT device status in real time.
Keywords
NLP, LLM, Fine-tuning, agents, assistants, visualization, API tools, IoT, Monitoring, Industry 4.0
tuning and deployment of the optimized and production-ready LLM. In Section 4, we illustrate the practical examples carried out and the real-world utility of our tool, presenting its limitations in Section 5. We conclude with Section 6 by summarizing our findings and outlining the future directions of our research.

2. Related Work

In recent months, we have seen a myriad of LLM research papers addressing the topic of context-aware LLMs through in-context learning. This capability enables them to generalize to almost any NLP task, commonly unseen during the pre-training and fine-tuning stages [3, 5, 6]. This direction has led the research community to explore the integration of LLMs with external tools such as document stores [7] or APIs [8], enhancing their generalization capabilities even more. LLM agents [9] are a new concept that arose from providing LLMs with (1) extensive up-to-date data pools beyond their fixed knowledge representations and (2) functions or tools to perform actions and automate processes [10, 11, 12, 13]. Such a two-fold strategy reduces the need for regular re-training. For example, Gorilla [8] leverages a multitude of APIs and documentation through document retrievers, highlighting the effectiveness of this framework.

Moreover, the reasoning capabilities of LLMs are influenced by the prompting strategies employed [5, 14, 15], since how natural language instructions are written significantly affects performance [16]. More complex prompting strategies like ReAct [9] became popular, combining reasoning and planning techniques by adding reasoning traces and task-specific actions to the prompt. These strategies benefit the integration of the LLM with external sources. In this new landscape, new benchmark frameworks were proposed [17, 18], which aim at designing reliable and robust evaluation methodologies.

The introduction of Generative Information Extraction (GIE) has further boosted the NLP field [19]. Recent studies [20] propose LLMs to generate structured information from natural language. Some closely related tasks, like text-to-SQL [21, 22], involve the transformation of natural language into SQL for querying external tools (i.e., databases). This generative approach proves effective even in scenarios involving complex schemas with millions of entities [23]. The ability of LLMs to manage these large schemas without dropping performance (effectively generating the target query following a specific format) is particularly significant for our research. We propose a generation step aiming at transforming natural language queries (sent to our virtual assistant) into structured JSON objects with the relevant parameters for the integration of the FlyThings® API.
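To make this concrete, the following minimal sketch shows the kind of mapping we target; the query text and the JSON field names are illustrative placeholders and do not reproduce the actual Flythings schema introduced in Section 3.1.

```python
import json

# Hypothetical user query sent to the virtual assistant (illustrative only).
query = "Show me the temperature of machine M-103 over the last 24 hours as a line chart."

# Hypothetical structured target the LLM should emit; field names are
# placeholders, not the real Flythings JSON schema.
target = {
    "device": "M-103",
    "property": "temperature",
    "time_range": {"last": 24, "unit": "hours"},
    "visualization": "line_chart",
}

print(json.dumps(target, indent=2))
```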
3. Proposed Method

In this section, we present our methodology, covering all the steps involved in our pipeline. We describe our data preparation stage, including the seed data creation and data augmentation process. We also formulate our supervised fine-tuning (SFT) method for our information extraction task, as well as the inference optimizations taken into account for our LLM deployment. The overall process is depicted in Figure 1.

3.1. Seed Data

In the absence of pre-existing user data for our task, which depends on the FlyThings® technology, we started creating a dataset. We collected feedback from the Flythings team, who provided us with initial examples of potential user inputs and expected outputs. In this way, we obtained a seed dataset consisting of 6 outputs, each of them with 3 different ways of expressing the input, agreed upon with the Flythings team. Given these pairs, we agreed on a specification, defining a JSON schema as the golden rule. Our pipeline starts with (1) a template-based method for generating new JSON outputs, as described in Figure 1, randomly selecting one of the available options for each of the JSON fields, following the schema depicted in Figure 2. In this way, we obtained a pool of examples for the next data augmentation step.
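As a minimal sketch of step (1), assuming a flat set of fields with a closed list of admissible values, the template-based generation could look as follows; the field names and value options are hypothetical stand-ins for the actual Flythings schema depicted in Figure 2.

```python
import json
import random

# Hypothetical value options per JSON field; the real options follow the
# Flythings schema depicted in Figure 2 (not reproduced here).
FIELD_OPTIONS = {
    "device": ["M-101", "M-102", "M-103", "pump-7"],
    "property": ["temperature", "vibration", "pressure", "energy_consumption"],
    "time_range": [
        {"last": 1, "unit": "hours"},
        {"last": 24, "unit": "hours"},
        {"last": 7, "unit": "days"},
    ],
    "visualization": ["line_chart", "bar_chart", "gauge"],
}


def sample_json_target(rng: random.Random) -> dict:
    """Randomly select one option per field to build a JSON output target."""
    return {field: rng.choice(options) for field, options in FIELD_OPTIONS.items()}


if __name__ == "__main__":
    rng = random.Random(42)
    # Pool of JSON output targets fed into the data augmentation step (2).
    pool = [sample_json_target(rng) for _ in range(100)]
    print(json.dumps(pool[0], indent=2))
```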
3.2. Data Augmentation

Our seed dataset was scarce and limited in scope, lacking input query diversity. Therefore, we followed a data augmentation approach. We created a custom pipeline for generating alternative input queries, given the reference (input, output) pairs from the seed data. For this task, we leveraged the Mixture of Experts (MoE) LLM Mixtral-8x7B-Instruct-v0.1 from Mistral AI [24].

We aimed at generating variant inputs for each JSON output from the pool depicted in Figure 1, so that we could increase the available (input, output) pairs. We used the original seed as reference within the instruction illustrated in Figure 3, generating 3 variations of the input for each target through few-shot in-context learning [6]. This process corresponds to the (2) data augmentation step depicted in Figure 1. We increased our dataset up to 355 curated samples for the following SFT stage.
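A rough sketch of this step is shown below, assuming the model is queried through the Hugging Face transformers text-generation pipeline; the prompt wording and the seed pair are illustrative only (the actual instruction is the one shown in Figure 3), and in practice Mixtral's [INST] ... [/INST] chat format and a multi-GPU setup would be applied.

```python
import json
from transformers import pipeline

# Seed (input, output) pair used as a few-shot reference; both are hypothetical
# placeholders, not entries from our actual seed dataset.
SEED_INPUT = "Show me the temperature of machine M-103 over the last 24 hours as a line chart."
SEED_OUTPUT = {
    "device": "M-103",
    "property": "temperature",
    "time_range": {"last": 24, "unit": "hours"},
    "visualization": "line_chart",
}

# Illustrative instruction; the actual prompt is the one illustrated in Figure 3.
PROMPT_TEMPLATE = (
    "You rewrite user queries for an IoT monitoring assistant.\n"
    "Reference query: {seed_input}\n"
    "Reference JSON: {seed_output}\n"
    "Target JSON: {target}\n"
    "Write 3 alternative user queries a person could use to ask for the target JSON:\n"
)


def augment(target: dict, generator) -> str:
    """Generate alternative input queries for one JSON output target."""
    prompt = PROMPT_TEMPLATE.format(
        seed_input=SEED_INPUT,
        seed_output=json.dumps(SEED_OUTPUT),
        target=json.dumps(target),
    )
    result = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
    # Strip the echoed prompt and keep only the generated variations.
    return result[0]["generated_text"][len(prompt):]


if __name__ == "__main__":
    # Loading Mixtral-8x7B-Instruct-v0.1 requires substantial GPU memory;
    # loading details (device mapping, dtype) are omitted in this sketch.
    generator = pipeline("text-generation", model="mistralai/Mixtral-8x7B-Instruct-v0.1")
    new_target = {"device": "pump-7", "property": "pressure",
                  "time_range": {"last": 7, "unit": "days"}, "visualization": "bar_chart"}
    print(augment(new_target, generator))
```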
3.3. Supervised Fine-Tuning

Before diving into the details of the fine-tuning process, it is important to understand why supervised fine-tuning was necessary in the first place. While zero-shot or few-shot (i.e., in-context) learning [25] can be effective for general NLP tasks, it entails challenges when the task
[Figure 1 diagram: JSON schema → (1) JSON output pool → instruction-based (2) input–output task pool → (3) supervised fine-tuning → (4) AWQ quantization → (5) deployment]
Figure 1: Our pipeline begins with the design of the JSON schema with the formatting rules, used as the specification for (1) a template-based method for the generation of random JSON output targets for our task. These outputs are fed into (2) a data augmentation phase utilizing an LLM to generate multiple inputs corresponding to each previously generated JSON output, so that we add diversity to how users convey queries. Subsequently, we perform (3) the supervised fine-tuning for our information extraction task, (4) the quantization stage for model inference optimization, and (5) the deployment phase, culminating in the integration of the FlyThings® endpoint for the creation of a virtual assistant enhanced with visualizations.
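Steps (4) and (5) of the caption could be sketched roughly as below, assuming the fine-tuned model is quantized with AWQ (as labeled in the diagram) and served with vLLM [29]; whether vLLM is the actual serving engine is an assumption here, and the checkpoint name, prompt format, and Flythings endpoint URL are placeholders rather than our deployment configuration.

```python
import json

import requests
from vllm import LLM, SamplingParams

# Placeholders: neither the fine-tuned AWQ checkpoint nor the real Flythings
# endpoint is reproduced here.
MODEL_PATH = "our-org/flythings-json-extractor-awq"               # hypothetical checkpoint
FLYTHINGS_ENDPOINT = "https://example-flythings-endpoint/api/query"  # hypothetical URL

# Serve the AWQ-quantized model with vLLM; greedy decoding keeps the JSON stable.
llm = LLM(model=MODEL_PATH, quantization="awq")
params = SamplingParams(temperature=0.0, max_tokens=256)


def handle_user_query(query: str) -> dict:
    """Turn a natural language query into JSON and forward it to the platform."""
    prompt = f"Extract the request as JSON.\nQuery: {query}\nJSON:"  # illustrative prompt
    generation = llm.generate([prompt], params)[0].outputs[0].text
    payload = json.loads(generation)  # assumes the model emits valid JSON
    response = requests.post(FLYTHINGS_ENDPOINT, json=payload, timeout=30)
    response.raise_for_status()
    return response.json()  # data later rendered as a visualization in the assistant
```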
developers team, for their continuous support and feedback to enhance our LLM generation capabilities and integration within their systems.

References

[1] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg, et al., Sparks of artificial general intelligence: Early experiments with gpt-4, arXiv preprint arXiv:2303.12712 (2023).
[2] A. Kulkarni, A. Shivananda, A. Kulkarni, D. Gudivada, LLMs for Enterprise and LLMOps, Apress, Berkeley, CA, 2023, pp. 117–154. URL: https://doi.org/10.1007/978-1-4842-9994-4_7. doi:10.1007/978-1-4842-9994-4_7.
[3] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Models are Unsupervised Multitask Learners, 2019. URL: https://api.semanticscholar.org/CorpusID:160025533.
[4] J. Wei, M. Bosma, V. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, Q. V. Le, Finetuned Language Models Are Zero-Shot Learners, ArXiv abs/2109.01652 (2021). URL: https://api.semanticscholar.org/CorpusID:237416585.
[5] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, Y. Iwasawa, Large Language Models are Zero-Shot Reasoners, ArXiv abs/2205.11916 (2022). URL: https://api.semanticscholar.org/CorpusID:249017743.
[6] D. Dai, Y. Sun, L. Dong, Y. Hao, S. Ma, Z. Sui, F. Wei, Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers (2023). arXiv:2212.10559.
[7] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., Retrieval-augmented generation for knowledge-intensive nlp tasks, Advances in Neural Information Processing Systems 33 (2020) 9459–9474.
[8] S. G. Patil, T. Zhang, X. Wang, J. E. Gonzalez, Gorilla: Large Language Model Connected with Massive APIs, arXiv preprint arXiv:2305.15334 (2023).
[9] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, Y. Cao, React: Synergizing reasoning and acting in language models, arXiv preprint arXiv:2210.03629 (2022).
[10] A. Parisi, Y. Zhao, N. Fiedel, TALM: Tool Augmented Language Models, ArXiv abs/2205.12255 (2022). URL: https://api.semanticscholar.org/CorpusID:249017698.
[11] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, T. Scialom, Toolformer: Language models can teach themselves to use tools, arXiv preprint arXiv:2302.04761 (2023).
[12] R. Nakano, J. Hilton, S. Balaji, J. Wu, L. Ouyang, C. Kim, C. Hesse, S. Jain, V. Kosaraju, W. Saunders, X. Jiang, K. Cobbe, T. Eloundou, G. Krueger, K. Button, M. Knight, B. Chess, J. Schulman, WebGPT: Browser-assisted question-answering with human feedback, CoRR abs/2112.09332 (2021). URL: https://arxiv.org/abs/2112.09332. arXiv:2112.09332.
[13] S. Yao, R. Rao, M. Hausknecht, K. Narasimhan, Keep CALM and explore: Language models for action generation in text-based games, in: B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 8736–8754. URL: https://aclanthology.org/2020.emnlp-main.704. doi:10.18653/v1/2020.emnlp-main.704.
[14] J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. H. hsin Chi, F. Xia, Q. Le, D. Zhou, Chain of Thought Prompting Elicits Reasoning in Large Language Models, ArXiv abs/2201.11903 (2022). URL: https://api.semanticscholar.org/CorpusID:246411621.
[15] W. Huang, P. Abbeel, D. Pathak, I. Mordatch, Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents, CoRR abs/2201.07207 (2022). URL: https://arxiv.org/abs/2201.07207. arXiv:2201.07207.
[16] Anthropic, Long context prompting for claude 2.1, 2023. URL: https://www.anthropic.com/news/claude-2-1-prompting.
[17] Q. Xu, F. Hong, B. Li, C. Hu, Z. Chen, J. Zhang, On the Tool Manipulation Capability of Open-source Large Language Models, arXiv preprint arXiv:2305.16504 (2023).
[18] Y. Qin, S. Liang, Y. Ye, K. Zhu, L. Yan, Y. Lu, Y. Lin, X. Cong, X. Tang, B. Qian, et al., Toolllm: Facilitating large language models to master 16000+ real-world apis, arXiv preprint arXiv:2307.16789 (2023).
[19] D. Xu, W. Chen, W. Peng, C. Zhang, T. Xu, X. Zhao, X. Wu, Y. Zheng, E. Chen, Large Language Models for Generative Information Extraction: A Survey, ArXiv abs/2312.17617 (2023). URL: https://api.semanticscholar.org/CorpusID:266690657.
[20] A. Dunn, J. Dagdelen, N. Walker, S. Lee, A. S. Rosen, G. Ceder, K. Persson, A. Jain, Structured information extraction from complex scientific text with fine-tuned large language models, 2022. arXiv:2212.05238.
[21] J. Li, B. Hui, G. Qu, J. Yang, B. Li, B. Li, B. Wang, B. Qin, R. Cao, R. Geng, N. Huo, X. Zhou, C. Ma, G. Li, K. C. C. Chang, F. Huang, R. Cheng, Y. Li, Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs, 2023. arXiv:2305.03111.
[22] R. Srivastava, Defog SQLCoder, 2023. URL: https://github.com/defog-ai/sqlcoder.
[23] M. Josifoski, N. De Cao, M. Peyrard, F. Petroni, R. West, GenIE: Generative information extraction, in: M. Carpuat, M.-C. de Marneffe, I. V. Meza Ruiz (Eds.), Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Seattle, United States, 2022, pp. 4626–4643. URL: https://aclanthology.org/2022.naacl-main.342. doi:10.18653/v1/2022.naacl-main.342.
[24] Mistral AI, Mixtral of experts, 2023. URL: https://mistral.ai/news/mixtral-of-experts/ and https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1.
[25] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language Models are Few-Shot Learners, 2020. arXiv:2005.14165.
[26] L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, T. Liu, A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions, ArXiv abs/2311.05232 (2023). URL: https://api.semanticscholar.org/CorpusID:265067168.
[27] T. Dettmers, A. Pagnoni, A. Holtzman, L. Zettlemoyer, QLoRA: Efficient Finetuning of Quantized LLMs, ArXiv abs/2305.14314 (2023). URL: https://api.semanticscholar.org/CorpusID:258841328.
[28] E. J. Hu, yelong shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-Rank Adaptation of Large Language Models, in: International Conference on Learning Representations, 2022. URL: https://openreview.net/forum?id=nZeVKeeFYf9.
[29] W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, I. Stoica, Efficient Memory Management for Large Language Model Serving with PagedAttention, in: Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.
[30] E. Frantar, S. Ashkboos, T. Hoefler, D. Alistarh, GPTQ: Accurate Post-training Compression for Generative Pretrained Transformers, arXiv preprint arXiv:2210.17323 (2022).
[31] J. Lin, J. Tang, H. Tang, S. Yang, X. Dang, S. Han, AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration, arXiv (2023).
[32] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P. Welinder, P. Christiano, J. Leike, R. Lowe, Training language models to follow instructions with human feedback, 2022. arXiv:2203.02155.
[33] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, C. Finn, Direct Preference Optimization: Your Language Model is Secretly a Reward Model, 2023. arXiv:2305.18290.

A. Flythings

The FlyThings® platform is an all-in-one tool for IoT device management for many different productive sectors. It is designed for the analysis and forecasting of data records of IoT devices, considering any of the data types available at scale. FlyThings® handles a wide variety of sensors, systems and applications for specific use cases including, but not limited to, smart industries or intelligent energy. FlyThings® helps in the decision-making process, yielding better results for enterprises, with ad hoc offerings including modular Big Data as a Service (BDaaS) with standard APIs for data management and visualization. Check https://itg.es/en/monitoring-iot-platform-flythings/ for more details.