
LLMs on the Fly: Text-to-JSON for Custom API Calling

Miguel Escarda-Fernández¹, Iñigo López-Riobóo-Botana¹, Santiago Barro-Tojeiro¹, Lara Padrón-Cousillas¹, Sonia Gonzalez-Vázquez¹, Antonio Carreiro-Alonso¹,² and Pablo Gómez-Area¹,²

¹ Centro Tecnológico ITG, Cantón Grande 9, Planta 3, 15003, A Coruña, Spain
² FlyThings® Technologies - IoT for Solutions 4.0, Cantón Grande 9, Planta 3, 15003, A Coruña, Spain

Abstract
In the rapidly evolving landscape of Natural Language Processing (NLP), there is a growing demand for agile and intuitive tools due to the increasing model capabilities, primarily in the field of Large Language Models (LLMs). In recent months, we have seen great progress in the Natural Language Generation (NLG) landscape, with a proliferation of generative AI applications leveraging LLMs for a vast number of tasks. The power of LLMs resides in their ability to generalize almost any NLP task to the problem of next-token prediction, thus simplifying the traditional NLP pipelines consisting of intensive data labeling and domain-specific fine-tuning for a single task. Moreover, LLMs are enhanced (1) with external knowledge bases, which improve their reasoning and domain understanding, and (2) with external tools, which improve their ability to perform actions.
We present a novel approach that harnesses the power of LLMs to transform natural language inputs into structured data representations, facilitating seamless interaction with custom APIs for real-time data visualization. We explore the integration of the Flythings® Technologies API for Internet of Things (IoT) device solutions in the Industry 4.0 domain. This system demonstration presents a chat-based virtual assistant that allows users to query the status of monitored machines and devices. The core component of the application is an LLM that serves as a bridge between user queries and machine-readable JSON objects, which adhere to a predefined schema following the Flythings standard. Our LLM output facilitates the interaction with the Flythings API, leading to the generation of visualizations that illustrate IoT device status in real time.

Keywords
NLP, LLM, Fine-tuning, agents, assistants, visualization, API tools, IoT, Monitoring, Industry 4.0

SEPLN-CEDI-PD 2024: Seminar of the Spanish Society for Natural Language Processing: Projects and System Demonstrations, June 19-20, 2024, A Coruña, Spain.
Contact: [email protected] (M. Escarda-Fernández); [email protected] (I. López-Riobóo-Botana); [email protected] (S. Barro-Tojeiro); [email protected] (L. Padrón-Cousillas); [email protected] (S. Gonzalez-Vázquez); [email protected] (A. Carreiro-Alonso); [email protected] (P. Gómez-Area)
LinkedIn: https://www.linkedin.com/in/%C3%AD%C3%B1igo-luis-l%C3%B3pez-riob%C3%B3o-botana-4a43001a2/ (I. López-Riobóo-Botana); https://www.linkedin.com/in/phd-sonia-gonz%C3%A1lez-v%C3%A1zquez-38b14a8b/ (S. Gonzalez-Vázquez)
ORCID: 0000-0002-9080-1535 (M. Escarda-Fernández); 0000-0002-7310-0702 (I. López-Riobóo-Botana); 0009-0006-2782-2567 (S. Gonzalez-Vázquez)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.
¹ A brief description of the FlyThings® Technologies services is given in Appendix A; see https://itg.es/en/monitoring-iot-platform-flythings/ for additional information.

1. Introduction

Undoubtedly, Large Language Models (LLMs) are here to stay. Their emergence has marked the beginning of an era in which natural language can be used to perform any Natural Language Processing (NLP) task following prompting techniques [1], where previously large amounts of labeled data were required to fine-tune ad hoc models for a single task. LLMs, with their remarkable language understanding and generation capabilities, have the potential to dramatically ease the continuous iteration of common NLP workflows including, but not limited to, data labeling, data augmentation or fine-tuning. Moreover, production-ready systems using LLMs require less time than the traditional approaches [2], thanks to the generalization of almost any NLP task to the problem of Causal Language Modeling (CLM) [3, 4, 5, 6]. By providing such a text interface, we communicate with machines in a natural way, allowing for intuitive interactions.

In this context, our system demonstration paper introduces a novel application that leverages the strengths of LLMs in the industrial monitoring and Internet of Things (IoT) domains. We integrate our fine-tuned LLMs in the Flythings® Technologies platform¹. Our chat-based application allows users to submit queries using natural language. These are then processed by our model, whose task is to extract the relevant information from the input query by identifying and mapping the Flythings API input fields, generating a well-formed JSON output that follows a specific schema. This enables us to easily interact with the Flythings API, sending the corresponding JSON objects to its services for real-time monitoring and visualization of IoT device details.

In Section 2, we conduct a comprehensive review of the existing research, establishing a solid foundation and context for our work. In Section 3, we present our pipeline and infrastructure, describing the design details and all the steps involved, including the data augmentation, fine-tuning and deployment of the optimized and production-ready LLM. In Section 4, we illustrate the practical examples carried out and the real-world utility of our tool, presenting its limitations in Section 5. We conclude with Section 6 by summarizing our findings and outlining the future directions of our research.

2. Related Work

In recent months, we have seen a myriad of LLM research papers addressing the topic of context-aware LLMs through in-context learning. This capability enables them to generalize to almost any NLP task, commonly unseen during the pre-training and fine-tuning stages [3, 5, 6]. This direction has led the research community to explore the integration of LLMs with external tools such as document stores [7] or APIs [8], further enhancing their generalization capabilities. LLM agents [9] are a new concept arising from providing LLMs with (1) extensive up-to-date data pools beyond their fixed knowledge representations and (2) functions or tools to perform actions and automate processes [10, 11, 12, 13]. Such a two-fold strategy reduces the need for regular re-training. For example, Gorilla [8] leverages a multitude of APIs and documentation through document retrievers, highlighting the effectiveness of this framework.

Moreover, the reasoning capabilities of LLMs are influenced by the prompting strategies followed [5, 14, 15], where how natural language instructions are written significantly affects performance [16]. More complex prompting strategies like ReAct [9] became popular, combining reasoning and planning techniques by adding reasoning traces and task-specific actions to the prompt. These strategies benefit the integration of the LLM with external sources. In this new landscape, new benchmark frameworks were proposed [17, 18], which aim at designing reliable and robust evaluation methodologies.

The introduction of Generative Information Extraction (GIE) has further boosted the NLP field [19]. Recent studies [20] propose LLMs to generate structured information from natural language. Some closely related tasks, like text-to-SQL [21, 22], involve the transformation of natural language into SQL for querying external tools (i.e., databases). This generative approach proves effective even in scenarios involving complex schemas with millions of entities [23]. The ability of LLMs to manage these large schemas without dropping performance (effectively generating the target query following a specific format) is particularly significant for our research. We propose a generation step aiming at transforming natural language queries (sent to our virtual assistant) into structured JSON objects with the relevant parameters for the integration of the FlyThings® API.

3. Proposed Method

In this section, we present our methodology, covering all the steps involved in our pipeline. We describe our data preparation stage, including the seed data creation and data augmentation process. We also formulate our supervised fine-tuning (SFT) method for our information extraction task, as well as the inference optimizations taken into account for our LLM deployment. The overall process is depicted in Figure 1.

3.1. Seed Data

In the absence of pre-existing user data for our task, which depends on the FlyThings® technology, we started by creating a dataset. We collected feedback from the Flythings team, who provided us with the initial examples of potential user inputs and expected outputs. In this way, we obtained a seed dataset consisting of 6 outputs, each of them with 3 different ways of expressing the input, agreed with the Flythings team. Given these pairs, we agreed on a specification, defining a JSON schema as the golden rule. Our pipeline starts with (1) a template-based method for generating new JSON outputs as described in Figure 1, randomly selecting one of the available options for each of the JSON fields, following the schema depicted in Figure 2. In this way, we obtained a pool of examples for the next data augmentation step.

3.2. Data Augmentation

Our seed dataset was scarce and limited in scope, lacking input query diversity. Therefore, we followed a data augmentation approach. We created a custom pipeline for generating alternative input queries, given the reference (input, output) pairs from the seed data. For this task, we leveraged the Mixture of Experts (MoE) LLM Mixtral-8x7B-Instruct-v0.1 from Mistral AI [24].

We aimed at generating variant inputs for each JSON output from the pool depicted in Figure 1, so that we could increase the available (input, output) pairs. We used the original seed as reference within the instruction illustrated in Figure 3, generating 3 variations of the input for each target through few-shot in-context learning [6]. This process corresponds to the (2) data augmentation step depicted in Figure 1. We increased our dataset up to 355 curated samples for the following SFT stage.

3.3. Supervised Fine-Tuning

Before diving into the details of the fine-tuning process, it is important to understand why supervised fine-tuning was necessary in the first place. While zero-shot or few-shot (i.e., in-context) learning [25] can be effective for general NLP tasks, it entails challenges when the task
[Figure 1: pipeline diagram. Blocks: (1) JSON schema and instruction task pool feeding an output generator; (2) LLM-based data augmentation with the prompt "Your task is to generate in Spanish 3 alternative inputs for a specific JSON output (...) This is the output schema: {json_schema}"; then (3) supervised fine-tuning, (4) AWQ quantization and (5) inference.]

Figure 1: Our pipeline begins with the design of the JSON schema with the formatting rules, used as the specification for (1) a template-based method for the generation of random JSON output targets for our task. These outputs are fed into (2) a data augmentation phase utilizing an LLM to generate multiple inputs corresponding to each previously generated JSON output, so that we add diversity to how users convey queries. Then follow (3) the supervised fine-tuning step for our information extraction task, (4) the quantization stage for model inference optimization and (5) the deployment phase, culminating with the integration of the FlyThings® endpoint for the creation of a virtual assistant enhanced with visualizations.
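Step (1) of this pipeline, the template-based generation of random JSON output targets, can be sketched as follows. This is a minimal illustration rather than the authors' code: the option pools below are hypothetical stand-ins for the values agreed with the FlyThings team.

```python
import random

# Hypothetical option pools for each schema field; in the real pipeline these
# come from the seed specification agreed with the FlyThings team.
OPTIONS = {
    "property": ["tap 2", "temperature", "pressure"],
    "foi": ["greenhouse water", "boiler room"],
    "type": ["chart", "table"],
    "subtype": ["line", "bar"],
    "temporalScale": ["DAILY", "HOURLY"],
    "temporalScaleType": ["CHANGES", "AVERAGE"],
}

def random_output(rng: random.Random) -> dict:
    """Build one JSON output target by sampling each schema field independently."""
    return {
        "series": [{
            "property": rng.choice(OPTIONS["property"]),
            "foi": rng.choice(OPTIONS["foi"]),
            "asIncremental": rng.choice([True, False]),
        }],
        "visualization": {
            "config": {
                "type": rng.choice(OPTIONS["type"]),
                "subtype": rng.choice(OPTIONS["subtype"]),
            },
            "body": {
                "temporalScale": rng.choice(OPTIONS["temporalScale"]),
                "temporalScaleType": rng.choice(OPTIONS["temporalScaleType"]),
            },
        },
    }

# A pool of output targets for the subsequent data augmentation step.
pool = [random_output(random.Random(seed)) for seed in range(100)]
```

Each sampled target is then paired with LLM-generated input queries in step (2).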

is very specific and requires a thorough generation process, limiting hallucinations [26]. In our case, we faced some issues with the in-context learning approach for classifying and extracting the corresponding fields for the Flythings® task. On the one hand, (1) zero-shot learning, which involves making direct predictions without any previous examples in the training distribution, had problems with detailed input queries requiring complex JSON outputs, for which the corresponding JSON schema in the instruction was not enough. This led to classification inaccuracies in the generation step. Similarly, (2) few-shot learning, which relies on providing the model with some examples of the task in the initial instruction, was limited and biased by the quality and expressiveness of the provided sequences at inference time. In short, these two methods captured neither the complexity nor the specificity of our domain, leading to sub-optimal performance in terms of both accuracy and reliability. Recognizing these limitations, we transitioned to a fine-tuning approach to tailor the model to our specific needs.

{
  "series": [
    {
      "property": String,
      "foi": String,
      "module": String,
      "asIncremental": Boolean
    }
  ],
  "visualization": {
    "config": {
      "type": Enum,
      "subtype": Enum
    },
    "body": {
      "period": Enum,
      (...)
      "temporalScaleType": Enum
    }
  }
}

Figure 2: Overview of the JSON schema used for output validation. Notice that, according to this specification, each JSON output has two main parts: (1) the series field, which includes information about the specific Flythings IoT devices being queried, and (2) the visualization properties, which include the required information for the visual representation of the series data in the virtual assistant.

Instruction: Your task is to generate 3 alternative inputs for a specific JSON output. {rules_to_follow}
This is the output schema:
{"series": [{ "property": "tap 2", "foi": "greehouse water", "asIncremental": True }], "visualization": {"config": {"type": "chart", "subtype": "line"}, "body": { "temporalScale": "DAILY", "temporalScaleType": "CHANGES" }}}

Input1: View the accumulated status changes for tap 2 of the greenhouse water device on a daily graph.
Input2: Observe the daily graph that displays the collective status alterations of tap 2 in the greenhouse watering device.
Input3: Examine the daily chart showing the aggregate changes in the status of greenhouse water device's tap 2.

Figure 3: The few-shot data augmentation task. We designed the following prompt: (1) the system instruction (displayed in black), including the rules (in bold curly brackets) with the seed pairs as reference guiding the generation with few-shot examples (omitted for clarity). Then, we present (2) one output from the pool as the target (highlighted in blue) and (3) we generate three new input queries (highlighted in green).

² https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B
³ https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1

During the fine-tuning stage, we assessed multiple models of up to 7 billion parameters, considering the trade-off between model performance and our hardware limitations. We finally chose the instruction fine-tuned model teknium/OpenHermes-2.5-Mistral-7B², based on the mistralai/Mistral-7B-Instruct-v0.1 model³. We leveraged the dataset from our previous data augmentation step
from Section 3.2, following the QLoRA [27] approach for efficient fine-tuning. Similar to LoRA (Low-Rank Adaptation of large language models) [28], which freezes the pre-trained model weights and adds trainable rank decomposition matrices to each transformer block (eliminating the need for full fine-tuning), QLoRA goes a step further by quantizing the weights of the frozen backbone LLM, adding the LoRA adapters with paged optimizers to manage memory spikes. This results in more efficient memory management for fine-tuning [27].

3.4. Inference Optimization

After the supervised fine-tuning stage of our model, we had to determine the inference requirements in a production environment, considering (1) our hardware limitations and (2) the need for low latency to support real-time queries. In this way, we explored the available options for reducing the computational requirements while maintaining (or minimally decreasing) the LLM performance. We opted for the vLLM [29] library, specifically designed for fast and efficient serving of LLMs including, but not limited to, paged attention optimizations, continuous batching of incoming requests and optimized CUDA kernels. We compared the performance of different quantization techniques supported by vLLM, such as GPTQ [30] and AWQ [31]. We chose AWQ because it offered the best throughput while maintaining performance⁴. We deployed our LLM service on the proprietary ITG clusters, using an RTX A6000 48 GB GDDR6 GPU.

4. Chatbot Experimentation

For our experimentation, we implemented a new virtual assistant view in the FlyThings® framework. The front-end of the chatbot is in charge of loading the user context, i.e., the list of IoT devices available to the user. With the environment set up, each input query is sent to the LLM service, which generates the corresponding JSON output following the schema described in Figure 2. We identify the closest IoT device information matching the extracted device and property (and optionally module, if present) JSON fields. Then, we follow these steps: (1) if there are no matches, the user is prompted to try again; (2) if there is exactly one match, the next step is executed; (3) if there is more than one match, a radio button is displayed for the user to choose among them. Depending on the visualization format (graph, table, indicator and so on), a request to the observation API endpoints⁵ is processed, including all the chart configuration. Finally, the visual widget is loaded, showing the results to the user. We include an example in Figure 4. We also provide a video demonstration⁶ of the virtual assistant.

5. Limitations

In this paper we introduce the first version of the system as a proof-of-concept demo, still in its early stage of development. We focused on the data augmentation, fine-tuning and deployment stages, mainly due to time constraints. We did not perform a thorough evaluation, and we acknowledge the importance of this process; but since the project is linked to a new market product by the Flythings® company, we aligned with the team requirements, which were more oriented to fast prototyping for a first usable version of the chat interface.

6. Conclusions and Future Work

In this paper we present a novel approach for querying the Flythings® framework. We described the system architecture and the NLP pipeline for the dataset preparation, LLM fine-tuning and inference optimization stages. Our approach is generalizable to any text-to-JSON or text-to-API task following the proposed pipeline. We handle user queries in natural language with a virtual assistant, considering visual feedback. Our next steps include refining the fine-tuned LLM using preference data from users interacting with the system. We will study in more detail both the helpfulness and the accuracy of our model outputs by means of thorough evaluation and benchmarking. We plan to explore Reinforcement Learning from Human Feedback (RLHF) [32] and Direct Preference Optimization (DPO) [33] for further alignment with human preferences. We also foresee future applications of Virtual Reality (VR), which would improve usability under real conditions and enhance user experience. We aim to broaden the current functionality beyond querying IoT devices, adding more complex Flythings® IoT operations, such as managing device actions, alerts or dashboards.

⁴ The AWQ quantization method consistently outperforms GPTQ across different model scales in their evaluation benchmark. Check the original work for more details.
⁵ https://deviot.flythings.io/api/apidocs/index.html#api-03-Request_Observations
⁶ Demo (video) available at https://youtu.be/qHs47rcmpHU
⁷ https://itg.es/cervera-celia/

Acknowledgments

This ongoing R&D project is supported by the CEL.IA network initiative⁷ through the CDTI (Centro para el Desarrollo Tecnológico Industrial) (grant CER-20211022) by the Ministerio de Ciencia e Innovación. This research is also possible thanks to the ITG-Flythings collaboration. We would like to express our gratitude to the Flythings
Figure 4: An end-to-end example of the FlyThings® virtual assistant, integrating the LLM service with the API services.
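The match handling described in Section 4 (no match, exactly one match, several matches) can be sketched as follows. The device dictionaries and field names are hypothetical illustrations based on the series fields of Figure 2, not the actual front-end code.

```python
def match_devices(devices, extracted):
    """Return the user's devices matching the extracted 'property' and 'foi'
    fields, plus 'module' when the model extracted one (Section 4)."""
    keys = ["property", "foi"] + (["module"] if "module" in extracted else [])
    return [d for d in devices
            if all(d.get(k) == extracted.get(k) for k in keys)]

def next_action(matches):
    # (1) no match: ask the user to rephrase; (2) one match: proceed with the
    # API request; (3) several matches: show a radio button to disambiguate.
    if not matches:
        return "retry"
    if len(matches) == 1:
        return "request"
    return "choose"
```

For instance, if two devices share the same property and feature of interest, the assistant asks the user to choose; adding the extracted module narrows the match to one.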

developers team, for their continuous support and feedback to enhance our LLM generation capabilities and integration within their systems.

References

[1] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg, et al., Sparks of artificial general intelligence: Early experiments with GPT-4, arXiv preprint arXiv:2303.12712 (2023).
[2] A. Kulkarni, A. Shivananda, A. Kulkarni, D. Gudivada, LLMs for Enterprise and LLMOps, Apress, Berkeley, CA, 2023, pp. 117–154. doi:10.1007/978-1-4842-9994-4_7.
[3] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Models are Unsupervised Multitask Learners, 2019.
[4] J. Wei, M. Bosma, V. Zhao, K. Guu, A. W. Yu, B. Lester, N. Du, A. M. Dai, Q. V. Le, Finetuned Language Models Are Zero-Shot Learners, arXiv abs/2109.01652 (2021).
[5] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, Y. Iwasawa, Large Language Models are Zero-Shot Reasoners, arXiv abs/2205.11916 (2022).
[6] D. Dai, Y. Sun, L. Dong, Y. Hao, S. Ma, Z. Sui, F. Wei, Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers, 2023. arXiv:2212.10559.
[7] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., Retrieval-augmented generation for knowledge-intensive NLP tasks, Advances in Neural Information Processing Systems 33 (2020) 9459–9474.
[8] S. G. Patil, T. Zhang, X. Wang, J. E. Gonzalez, Gorilla: Large Language Model Connected with Massive APIs, arXiv preprint arXiv:2305.15334 (2023).
[9] S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. Narasimhan, Y. Cao, ReAct: Synergizing reasoning and acting in language models, arXiv preprint arXiv:2210.03629 (2022).
[10] A. Parisi, Y. Zhao, N. Fiedel, TALM: Tool Augmented Language Models, arXiv abs/2205.12255 (2022).
[11] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, T. Scialom, Toolformer: Language models can teach themselves to use tools, arXiv preprint arXiv:2302.04761 (2023).
[12] R. Nakano, J. Hilton, S. Balaji, J. Wu, L. Ouyang, C. Kim, C. Hesse, S. Jain, V. Kosaraju, W. Saunders, X. Jiang, K. Cobbe, T. Eloundou, G. Krueger, K. Button, M. Knight, B. Chess, J. Schulman, WebGPT: Browser-assisted question-answering with human feedback, CoRR abs/2112.09332 (2021).
[13] S. Yao, R. Rao, M. Hausknecht, K. Narasimhan, Keep CALM and explore: Language models for action generation in text-based games, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 8736–8754. doi:10.18653/v1/2020.emnlp-main.704.
[14] J. Wei, X. Wang, D. Schuurmans, M. Bosma, E. H. Chi, F. Xia, Q. Le, D. Zhou, Chain of Thought Prompting Elicits Reasoning in Large Language Models, arXiv abs/2201.11903 (2022).
[15] W. Huang, P. Abbeel, D. Pathak, I. Mordatch, Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents, CoRR abs/2201.07207 (2022).
[16] Anthropic, Long context prompting for Claude 2.1, 2023. URL: https://www.anthropic.com/news/claude-2-1-prompting.
[17] Q. Xu, F. Hong, B. Li, C. Hu, Z. Chen, J. Zhang, On the Tool Manipulation Capability of Open-source Large Language Models, arXiv preprint arXiv:2305.16504 (2023).
[18] Y. Qin, S. Liang, Y. Ye, K. Zhu, L. Yan, Y. Lu, Y. Lin, X. Cong, X. Tang, B. Qian, et al., ToolLLM: Facilitating large language models to master 16000+ real-world APIs, arXiv preprint arXiv:2307.16789 (2023).
[19] D. Xu, W. Chen, W. Peng, C. Zhang, T. Xu, X. Zhao, X. Wu, Y. Zheng, E. Chen, Large Language Models for Generative Information Extraction: A Survey, arXiv abs/2312.17617 (2023).
[20] A. Dunn, J. Dagdelen, N. Walker, S. Lee, A. S. Rosen, G. Ceder, K. Persson, A. Jain, Structured information extraction from complex scientific text with fine-tuned large language models, 2022. arXiv:2212.05238.
[21] J. Li, B. Hui, G. Qu, J. Yang, B. Li, B. Li, B. Wang, B. Qin, R. Cao, R. Geng, N. Huo, X. Zhou, C. Ma, G. Li, K. C. C. Chang, F. Huang, R. Cheng, Y. Li, Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs, 2023. arXiv:2305.03111.
[22] R. Srivastava, Defog SQLCoder, 2023. URL: https://github.com/defog-ai/sqlcoder.
[23] M. Josifoski, N. De Cao, M. Peyrard, F. Petroni, R. West, GenIE: Generative information extraction, in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Seattle, United States, 2022, pp. 4626–4643. doi:10.18653/v1/2022.naacl-main.342.
[24] Mistral AI, Mixtral of experts, 2023. URL: https://mistral.ai/news/mixtral-of-experts/ and https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1.
[25] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language Models are Few-Shot Learners, 2020. arXiv:2005.14165.
[26] L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, T. Liu, A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions, arXiv abs/2311.05232 (2023).
[27] T. Dettmers, A. Pagnoni, A. Holtzman, L. Zettlemoyer, QLoRA: Efficient Finetuning of Quantized LLMs, arXiv abs/2305.14314 (2023).
[28] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, LoRA: Low-Rank Adaptation of Large Language Models, in: International Conference on Learning Representations, 2022. URL: https://openreview.net/forum?id=nZeVKeeFYf9.
[29] W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. E. Gonzalez, H. Zhang, I. Stoica, Efficient Memory Management for Large Language Model Serving with PagedAttention, in: Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.
[30] E. Frantar, S. Ashkboos, T. Hoefler, D. Alistarh, GPTQ: Accurate Post-training Compression for Generative Pretrained Transformers, arXiv preprint arXiv:2210.17323 (2022).
[31] J. Lin, J. Tang, H. Tang, S. Yang, X. Dang, S. Han, AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration, arXiv, 2023.
[32] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P. Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P. Welinder, P. Christiano, J. Leike, R. Lowe, Training language models to follow instructions with human feedback, 2022. arXiv:2203.02155.
[33] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, C. Finn, Direct Preference Optimization: Your Language Model is Secretly a Reward Model, 2023. arXiv:2305.18290.

A. Flythings

The FlyThings® platform is an all-in-one tool for IoT device management across many different productive sectors. It is designed for the analysis and forecasting of the data records of IoT devices, considering any of the data types available at scale. FlyThings® handles a wide variety of sensors, systems and applications for specific use cases including, but not limited to, smart industries or intelligent energy. FlyThings® helps in the decision-making process, yielding better results for enterprises, with ad hoc offerings including modular Big Data as a Service (BDaaS) with standard APIs for data management and visualization. Check https://itg.es/en/monitoring-iot-platform-flythings/ for more details.
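As a closing illustration, a JSON output generated by the model can be checked against the Figure 2 structure before it is sent to the API. The `conforms` helper below is our own minimal sketch under that schema, not the FlyThings validation code, and it only covers the fields shown in Figure 2.

```python
def conforms(output: dict) -> bool:
    """Minimal structural check against the Figure 2 schema: a non-empty
    'series' list of device entries plus a 'visualization' block holding a
    chart config and a body with the temporal settings."""
    try:
        series = output["series"]
        if not isinstance(series, list) or not series:
            return False
        for entry in series:
            if not isinstance(entry["property"], str) or not isinstance(entry["foi"], str):
                return False
            if "asIncremental" in entry and not isinstance(entry["asIncremental"], bool):
                return False
        cfg = output["visualization"]["config"]
        body = output["visualization"]["body"]
        return (isinstance(cfg["type"], str)
                and isinstance(cfg["subtype"], str)
                and isinstance(body, dict))
    except (KeyError, TypeError):
        return False
```

A full deployment would more likely rely on a formal JSON Schema document and a validator library; this sketch only conveys the shape of the check.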
