Neurosymbolic AI - Why, What, and How



Amit Sheth, Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA ([email protected])
Kaushik Roy, Artificial Intelligence Institute, University of South Carolina, Columbia, SC, USA ([email protected])
Manas Gaur, University of Maryland Baltimore County, MD, USA ([email protected])

arXiv:2305.00813v1 [cs.AI] 1 May 2023

Abstract—Humans interact with the environment using a combination of perception, transforming sensory inputs from their environment into symbols, and cognition, mapping symbols to knowledge about the environment to support abstraction, reasoning by analogy, and long-term planning. Human-perception-inspired machine perception, in the context of AI, refers to large-scale pattern recognition from raw data using neural networks trained on self-supervised learning objectives such as next-word prediction or object recognition. Machine cognition, on the other hand, encompasses more complex computations, such as using knowledge of the environment to guide reasoning, analogy, and long-term planning. Humans can also control and explain their cognitive functions. This seems to require the retention of symbolic mappings from perception outputs to knowledge about the environment. For example, humans can follow and explain the guidelines and safety constraints driving their decision-making in safety-critical applications such as healthcare, criminal justice, and autonomous driving. While data-driven, neural network-based AI algorithms effectively model machine perception, symbolic knowledge-based AI is better suited to modeling machine cognition, because symbolic knowledge structures support explicit representations of the mappings from perception outputs to knowledge, enabling traceability and auditing of the AI system's decisions. Such audit trails are useful for enforcing application-level aspects of safety, such as regulatory compliance and explainability, by tracking the AI system's inputs, outputs, and intermediate steps. This first article in the Neurosymbolic AI department introduces and provides an overview of the rapidly emerging paradigm of Neurosymbolic AI, which combines neural networks and knowledge-guided symbolic approaches to create more capable and flexible AI systems. These systems have immense potential to advance both the algorithm-level (e.g., abstraction, analogy, reasoning) and application-level (e.g., explainable and safety-constrained decision-making) capabilities of AI systems.

I. WHY NEUROSYMBOLIC AI?

Neurosymbolic AI refers to AI systems that seek to integrate neural network-based methods with symbolic knowledge-based approaches. We present two perspectives to better understand the need for this combination: (1) algorithm-level considerations, e.g., the ability to support abstraction, analogy, and long-term planning; and (2) application-level considerations, e.g., enforcing explainability, interpretability, and safety.

Algorithm-Level Considerations

Researchers have identified distinct systems in the human brain that are specialized for processing information related to perception and cognition. These systems work together to support human intelligence and enable individuals to understand and interact with the world around them. Daniel Kahneman popularized a distinction between the goals and functions of System 1 and System 2 [1]. System 1 is crucial for enabling individuals to make sense of the vast amount of raw data they encounter in their environment and convert it into meaningful symbols (e.g., words, digits, and colors) that can be used for further cognitive processing. System 2 performs more conscious and deliberative higher-level functions (e.g., reasoning and planning). It uses background knowledge to accurately position the perception module's output, enabling complex tasks such as analogy, reasoning, and long-term planning. Despite having different functions, Systems 1 and 2 are interconnected and collaborate to produce the human experience. Together, these systems enable people to see, comprehend, and act, following their knowledge of the environment.

In the past decade, neural network algorithms trained on enormous volumes of data have demonstrated exceptional machine perception, e.g., high performance on self-supervision tasks such as predicting the next word and recognizing digits. Remarkably, training on such simple self-supervision tasks has led to impressive solutions to challenging problems, including protein folding, efficient matrix multiplication, and solving complex puzzles [2], [3]. However, knowledge enables humans to engage in cognitive processes beyond what is explicitly stated in available data. For example, humans make analogical connections between concepts in similar abstract contexts through mappings to knowledge structures that spell out such mappings [4]. Perhaps current generative AI systems such as GPT-4 can acquire the knowledge structures needed to support cognitive functionality from data alone [5]. The hypothesis is that next-word prediction over the many texts on the Internet can lead to an emergent "cognitive model" of the world that the neural network can use to support cognition. However, significant concern regarding their black-box nature, and the resulting inscrutability, hinders the reliable evaluation of their cognitive capabilities. On the other hand, though unsuited to high-volume data processing, a symbolic model is highly suited to supporting human-like cognition using knowledge structures (e.g., knowledge graphs). Thus, rather than depend on one system or the other, it makes more sense to integrate the two types of systems: neural network-based Systems 1, adept at big-data-driven processing, and symbolic knowledge-based Systems 2, adept at dealing with knowledge-dependent cognition.

Application-Level Considerations

The combination of Systems 1 and 2 in Neurosymbolic AI can enable important application-level features, such as explainability, interpretability, safety, and trust in AI. Recent research on explainable AI (XAI) methods that explain neural network decisions primarily involves post-hoc techniques like saliency maps, feature attribution, and prototype-based explanations. Such explanations are useful for developers but not easily understood by end-users. Additionally, neural networks can fail due to uncontrollable training-time factors like data artifacts, adversarial attacks, distribution shifts, and system failures. To ensure rigorous safety standards, it is necessary to incorporate appropriate background knowledge to set guardrails during training rather than as a post-hoc measure. Symbolic knowledge structures can provide an effective mechanism for imposing domain constraints for safety and explicit reasoning traces for explainability. These structures can create transparent and interpretable systems for end-users, leading to more trustworthy and dependable AI systems, especially in safety-critical applications [6].

Why Neurosymbolic AI?

Embodying intelligent behavior in an AI system must involve both perception, processing raw data, and cognition, using background knowledge to support abstraction, analogy, reasoning, and planning. Symbolic structures represent this background knowledge explicitly. While neural networks are a powerful tool for processing and extracting patterns from data, they lack explicit representations of background knowledge, hindering the reliable evaluation of their cognition capabilities. Furthermore, applying appropriate safety standards while providing explainable outcomes guided by concepts from background knowledge is crucial for establishing trustworthy models of cognition for decision support.

II. WHAT IS NEUROSYMBOLIC AI AND HOW DO WE ACHIEVE IT?

Neurosymbolic AI describes techniques that aim to merge the knowledge-based symbolic approach with neural network methods to improve the overall performance of AI systems. Such systems can blend the powerful approximation abilities of neural networks with symbolic reasoning capabilities, enabling them to reason about abstract concepts, extrapolate from limited data, and generate explainable results [7]. Together, these components support both the algorithm-level and application-level concerns introduced in the previous sections. Neurosymbolic AI methods can be classified under two main categories: (1) methods that compress structured symbolic knowledge for integration with neural patterns and reason using the integrated neural patterns, and (2) methods that extract information from neural patterns to allow for mapping to structured symbolic knowledge (lifting) and perform symbolic reasoning. Furthermore, we sub-categorize (1) into methods that utilize (a) compressed knowledge graph representations for integration with neural patterns and (b) compressed formal logic-based representations for integration with neural patterns. We also sub-categorize (2) into methods that employ (a) decoupled integration between the neural and symbolic components and (b) intertwined integration between the neural and symbolic components. These methods enable both algorithm-level and application-level functions to varying degrees of effectiveness, spanning low (L), medium (M), and high (H) scales. Figure 1 details our categorization of neurosymbolic AI methods.

Algorithm-Level Analysis of Methods in Category 1. For category 1(a), previous work has used two methods to compress knowledge graphs. One approach is to use knowledge graph embedding methods, which compress knowledge graphs by embedding them in high-dimensional real-valued vector spaces using techniques such as graph neural networks. This enables integration with the hidden representations of the neural network. The other approach is to use knowledge graph masking methods, which encode knowledge graphs in a form suitable for integration with the inductive biases of the neural network. Figure 2 illustrates the two approaches. The ability of neural networks to process large volumes of raw data also carries over to the neural networks used for knowledge graph compression when processing millions and billions of nodes and edges, i.e., large-scale perception ((H) in Figure 1). Utilizing the compressed representations in neural reasoning pipelines improves the system's cognition aspects, i.e., its abstraction, analogy, and planning capabilities. However, the improvements are modest ((M) in Figure 1) due to the lossy compression of the knowledge graph's full semantics (e.g., relationships aren't modeled effectively in compressed representations). Category 1(b) methods use matrix and higher-order tensor factorization to obtain compressed representations of objects and of the formal logic statements that describe the relationships between them (such as propositional logic, first-order logic, and second-order situation calculus). Improvements in cognition aspects follow a similar trend as in 1(a). However, compression techniques for formal logic are computationally inefficient and do not facilitate large-scale perception ((L) in Figure 1).

Application-Level Analysis of Methods in Category 1. For category 1(a), when the knowledge graph is compressed for integration into neural processing pipelines, its full semantics are no longer explicitly retained. Post-hoc explanation techniques, such as saliency maps, feature attribution, and prototype-based explanations, can only explain the outputs of the neural network. These explanations are primarily meant to assist system developers in diagnosing and troubleshooting algorithmic changes in the neural network's decision-making process. Unfortunately, they are not framed in domain or application terms and hence have limited value to end-users ((L) for low explainability in Figure 1). Knowledge graph compression methods can still be utilized to apply domain constraints, such as specifying modifications to pattern correlations in the neural network, as depicted in Figure 2.
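The embedding route of category 1(a) can be made concrete with a minimal sketch. The article names graph neural networks as one compression technique; the classic TransE objective below (train entity and relation vectors so that head + relation lands near tail) is our illustrative stand-in, using the toy triples from Figure 2. The variable names, dimensions, and learning rate are assumptions for the sketch, not part of any cited method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy knowledge graph from Figure 2: (head, relation, tail) triples.
triples = [("Apple", "has", "Antioxidants"),
           ("Grape", "has", "Antioxidants"),
           ("Apple", "is_a", "Fruit"),
           ("Grape", "is_a", "Fruit"),
           ("Watermelon", "is_a", "Fruit")]
entities = sorted({x for h, _, t in triples for x in (h, t)})
relations = sorted({r for _, r, _ in triples})

dim = 16
E = {e: rng.normal(scale=0.1, size=dim) for e in entities}   # entity embeddings
R = {r: rng.normal(scale=0.1, size=dim) for r in relations}  # relation embeddings

def score(h, r, t):
    # TransE scores a triple by how well head + relation lands on tail.
    return -np.linalg.norm(E[h] + R[r] - E[t])

# Margin-based training: push true triples above corrupted ones.
for step in range(200):
    for h, r, t in triples:
        t_bad = rng.choice([e for e in entities if e != t])  # corrupted tail
        if 1.0 - score(h, r, t) + score(h, r, t_bad) > 0:  # margin violated
            grad = E[h] + R[r] - E[t]
            n = np.linalg.norm(grad) + 1e-9
            E[h] -= 0.05 * grad / n          # pull the true triple together
            R[r] -= 0.05 * grad / n
            E[t] += 0.05 * grad / n
            grad_bad = E[h] + R[r] - E[t_bad]
            nb = np.linalg.norm(grad_bad) + 1e-9
            E[h] += 0.05 * grad_bad / nb     # push the corrupted triple apart
            R[r] += 0.05 * grad_bad / nb
            E[t_bad] -= 0.05 * grad_bad / nb

# A true triple should now tend to outscore corrupted variants.
print(score("Apple", "is_a", "Fruit") > score("Apple", "is_a", "Antioxidants"))
```

The learned entity vectors are what a category 1(a) system feeds into, or attends over within, the neural network's hidden representations.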
Fig. 1. The two primary types of neurosymbolic techniques, lowering and lifting, can be further divided into four sub-categories. Across the low (L), medium (M), and high (H) scales, these methods can be used to provide a variety of functions at both the algorithmic and application levels. The figure's contents:

1. Symbolic Structured Knowledge Compression for Integration with Neural Patterns (Integrated Neural Pattern Based Reasoning):
   a. Compressed Knowledge Graph Representation Based Methods: Knowledge Graph Embedding Based Methods (e.g., K-Adapter) and Knowledge Graph Mask Based Methods (e.g., TDLR).
      Algorithm-level features: Large-scale Perception (H), Abstraction (M), Analogy (M), Planning (M). Application-level features: User-Explainability (L), Domain Constraints (M), Scalability (H), Continual (H).
   b. Compressed Logic Representation Based Methods: Propositional Logic (e.g., KB-ANN) and First-Order Logic (e.g., Logical Neural Networks).
      Algorithm-level features: Large-scale Perception (L), Abstraction (M), Analogy (M), Planning (M). Application-level features: User-Explainability (L), Domain Constraints (L), Scalability (L), Continual (L).
2. Neural Pattern Lifting for Integration with Symbolic Structured Knowledge (Symbolic Structured Knowledge Based Reasoning):
   a. Decoupled Integration between Neural and Symbolic Components: Federated Pipeline Methods (e.g., Langchain Pipelines) and Serialized Pipeline Methods (e.g., Wolfram + ChatGPT).
      Algorithm-level features: Large-scale Perception (H), Abstraction (M), Analogy (M), Planning (M). Application-level features: User-Explainability (M), Domain Constraints (M), Scalability (H), Continual (L).
   b. Intertwined Integration between Neural and Symbolic Components: Program Abstraction Induction Methods (e.g., Probabilistic Programs) and End-to-End Differentiable Methods (e.g., PK-iL).
      Algorithm-level features: Large-scale Perception (H), Abstraction (H), Analogy (H), Planning (H). Application-level features: User-Explainability (H), Domain Constraints (H), Scalability (H), Continual (H).
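The masking route of category 1(a) can be sketched by using Figure 2's relatedness matrix to gate a self-attention computation. The gating scheme below, blocking attention between tokens whose concepts are unrelated in the knowledge graph, is a simplified assumption for illustration, not the TDLR method itself.

```python
import numpy as np

# Tokens correspond to the Figure 2 concepts.
tokens = ["Apple", "Grape", "Watermelon"]

# KG-derived relatedness from Figure 2: Apple and Grape share the
# "has Antioxidants" relation; Watermelon relates only to itself.
kg_mask = np.array([[1, 1, 0],
                    [1, 1, 0],
                    [0, 0, 1]], dtype=bool)

rng = np.random.default_rng(1)
d = 8
Q = rng.normal(size=(3, d))  # one query vector per token
K = rng.normal(size=(3, d))  # key vectors
V = rng.normal(size=(3, d))  # value vectors

def masked_attention(Q, K, V, mask):
    scores = Q @ K.T / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # the KG mask as an inductive bias
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights, weights @ V

weights, out = masked_attention(Q, K, V, kg_mask)
# Watermelon is unrelated to the other concepts, so it attends only to itself.
print(weights.round(2)[2])  # → [0. 0. 1.]
```

This is the sense in which the mask modifies the correlation information stored in a transformer's self-attention matrices: disallowed concept pairs simply receive zero attention weight.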

[Figure 2 contents:] An example knowledge graph (background knowledge: Apple has Antioxidants; Grape has Antioxidants; Apple, Grape, and Watermelon each is_a Fruit) compressed in two ways. Inductive bias-level structured knowledge compression encodes the graph as a relatedness mask over the concepts Apple, Grape, and Watermelon (Apple and Grape related; Watermelon related only to itself). Representation-level structured knowledge compression embeds the knowledge graph paths, e.g., "Antioxidants has-1 Apple is_a Fruit," "Antioxidants has-1 Grape is_a Fruit," and "Watermelon is_a Fruit."

Fig. 2. The figure illustrates two methods for compressing knowledge graphs to integrate them with neural processing pipelines. One approach involves embedding knowledge graph paths into vector spaces, enabling integration with the neural network's hidden representations. The other method involves encoding knowledge graphs as masks to modify the neural network's inductive biases. An example of an inductive bias is the correlation information stored in the self-attention matrices of a transformer neural network [8], [9].

Nonetheless, this process has limited constraint specification capabilities, because large neural networks have multiple processing layers and moving parts ((M) in Figure 1). It is challenging to determine whether modifications made to the network are retained throughout the various processing layers. Neural processing pipelines do offer a high degree of automation, making it easier for a system to scale across various use cases (such as by plugging in use-case-specific knowledge graphs) and to support continual adaptation throughout the system's life cycle (such as making continual modifications to the knowledge graphs). This capability is indicated by the letter (H) in Figure 1. For category 1(b), when compressed formal logic representations are integrated with neural processing pipelines, system scores tend to be low across all application-level aspects (user-explainability, domain constraints, scalability, and continual adaptation), as denoted by the letter (L) in Figure 1. This is primarily due to a significant user-technology barrier: end-users must familiarize themselves with the rigor and details of formal logic semantics to communicate with the system (e.g., to provide domain constraint specifications).

Algorithm-Level Analysis of Methods in Category 2. For category 2(a), the proliferation of large language models and their corresponding plugins has spurred the development of federated pipeline methods. These methods utilize neural networks to identify symbolic functions based on task descriptions that are specified using appropriate modalities such as natural language and images. Once the symbolic function is identified, the method transfers the task to the appropriate symbolic reasoner, such as a math or fact-based search tool. Figure 3 illustrates a federated pipeline method that utilizes the Langchain library. These methods are proficient in supporting large-scale perception through the large language model ((H) in Figure 1). However, their ability to facilitate algorithm-level functions related to cognition, such as abstraction, analogy, reasoning, and planning, is restricted by the language model's comprehension of the input query ((M) in Figure 1).
import os
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent

os.environ["SERPAPI_API_KEY"] = "REDACTED API KEY"
llm = OpenAI()  # the LLM backing the agent (not shown in the original listing)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
# see the list of agent types, such as "zero-shot-react-description", in the langchain documentation
federated_agent = initialize_agent(tools,
                                   llm,
                                   agent="zero-shot-react-description")
# enter a query
query = """Assuming it takes an hour to prepare for the drive,
how much time should be allotted for the total journey by car from NYC, USA to LA, USA?"""
# run the federated agent
federated_agent.run(query)

Fig. 3. A federated pipeline method using the Langchain library. The method employs a language model trained on chain-of-thought reasoning to segment the input query into tasks. The language model then utilizes task-specific symbolic solvers to derive solutions. Specifically, the language model recognizes that search and scientific computing (mathematics) symbolic solvers are necessary for the given query. The resulting solutions are subsequently combined and transformed into natural language for presentation to the user. The figure's contents:

Query: "Assuming it takes an hour to prepare for the drive, how much time should be allotted for the total journey by car from NYC, USA to LA, USA?"
The LLM-API with chain-of-thought reasoning issues two sub-queries:
- Google Serp-API: "How many hours does it take to drive from NYC, USA to LA, USA?" → 41 hours.
- Wolfram Alpha-API: "What is 1 + the number of hours it takes to go from NYC, USA to LA, USA?" → 1 + 41 = 42.
Response: "The driving time from New York City to Los Angeles is approximately 41 hours. If you add the one hour for preparation time, then you should allot around 42 hours for the total journey by car from NYC to LA."
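Stripped of library specifics, the federated routing in Figure 3 amounts to decomposing a query and dispatching each sub-task to a symbolic solver. The dependency-free sketch below makes that control flow explicit; the function names, the hard-coded decomposition, and the fact table are all illustrative assumptions, not part of Langchain or any real API.

```python
# Sketch of a federated pipeline (Figure 3). The two solvers stand in for the
# search tool (Serp-API) and the math tool; an LLM performs the decomposition
# in the real pipeline, whereas here it is hard-coded for clarity.

def search_tool(question: str) -> int:
    # Stand-in for a web-search tool; a real pipeline would call an external API.
    facts = {"drive hours NYC to LA": 41}
    return facts[question]

def math_tool(task: tuple) -> int:
    # Stand-in for a calculator / scientific-computing tool.
    a, op, b = task
    return a + b if op == "+" else a * b

def federated_agent(prep_hours: int) -> str:
    # Step 1: route sub-tasks to the appropriate symbolic solvers.
    drive_hours = search_tool("drive hours NYC to LA")   # fact lookup
    total = math_tool((prep_hours, "+", drive_hours))    # arithmetic
    # Step 2: compose the solver outputs into a natural-language answer.
    return f"Allot around {total} hours for the total journey."

print(federated_agent(1))  # → Allot around 42 hours for the total journey.
```

The decoupling is the point: each solver can be swapped or upgraded independently, which is also why continual adaptation hinges on the (expensive-to-retrain) language model rather than on the solvers.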

Category 2(b) methods use pipelines similar to category 2(a) federated pipelines. However, they possess the added ability to fully govern the learning of all pipeline components through end-to-end differentiable compositions of functions corresponding to each component. This level of control enables us to attain the levels of cognition, on the aspects of abstraction, analogy, and planning, that are appropriate for the given application ((H) in Figure 1), while still preserving large-scale perception capabilities. Figure 4 shows an example of this method for mental health diagnostic assistance.

Application-Level Analysis of Methods in Category 2. For systems belonging to category 2(a), tracing their chain-of-thought during processing immensely enhances the application-level aspect of user-explainability. However, the language model's ability to parse the input query and relate it to domain model concepts during response generation limits this ability ((M) in Figure 1). Furthermore, the specification of domain constraints in natural language using prompt templates also limits the constraint modeling capability, which depends on the language model's ability to comprehend application- or domain-specific concepts ((M) in Figure 1). Federated pipelines excel in scalability, since the language models and application plugins that facilitate their use for domain-specific use cases are becoming more widely available and accessible ((H) in Figure 1). Unfortunately, language models require enormous time and space resources to train, and hence continual domain adaptation using federated pipelines remains challenging ((L) in Figure 1). Nonetheless, advancements in language modeling architectures that support continual learning goals are fast gaining traction. Category 2(b) methods show significant promise, as they score highly on all application-level aspects, including user-explainability, domain constraints, scalability across use cases, and support for continual adaptation to application-specific changes ((H) in Figure 1). This is due to their high modeling flexibility and the closely intertwined coupling of system components: a change in any particular component leads to positive changes in all components of the system's pipeline.
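A toy version of such an end-to-end differentiable composition: a "neural" mapping from raw features to a domain concept, with a knowledge-derived constraint added to the loss so that perception and the symbolic requirement are optimized jointly. The specific constraint (non-negative feature contributions) and the penalty formulation are our illustrative assumptions; PK-iL itself is more elaborate.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: 2-D "raw perception" features with a linearly separable concept.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # ground-truth domain concept

w = np.zeros(2)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative domain constraint: both features must contribute non-negatively
# to the concept, encoded as a differentiable hinge penalty lam * sum(max(0, -w)).
lam = 1.0

for step in range(500):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(y)             # cross-entropy gradient
    grad_b = np.mean(p - y)
    grad_w += lam * np.where(w < 0, -1.0, 0.0)  # gradient of the penalty term
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

acc = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(w, acc)  # weights satisfy the constraint; accuracy should be high
```

Because the constraint lives inside the training objective rather than in a post-hoc filter, a change to the symbolic component (the penalty) propagates to every learned parameter, which is the coupling property the text attributes to category 2(b).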
Fig. 4. Depicts a pipeline that is fully differentiable from end to end. It consists of a composition of functions corresponding to various pipeline components. This pipeline enables the development of application-tailored AI systems that can be easily trained end-to-end. To accomplish this, trainable map functions are applied to raw data, converting it to concepts in the domain model. The example given in the figure relates to mental health diagnosis and conversational assistance. The map functions link fragments of raw data to decision variables in the diagnosis model, which are then used to apply constraints to the patient's response generated by the text generation model. Results from an existing implementation demonstrate that expert satisfaction levels reached 70% using such a pipeline, compared to 47% with LLMs in federated pipelines, such as OpenAI's text-davinci-003 [10]. The figure's contents:

Trainable map functions, Map_Function(X = query, Z = concept, Θ1), link fragments of user posts (e.g., struggling to cope using obsessive, intrusive thoughts; "I have a gun on my lap") to clinical concepts such as the SNOMED concepts "Health Related Behavior Finding" and "Intrusive Thoughts" and the DSM-5 definition of obsessive-compulsive disorder. An expert-defined domain model, Y = Expert_Defined_Domain_Model(X = query, Z = concept), traverses a suicidality decision tree: Wish to be Dead → Non-Specific Active Suicidal Thoughts → Active Suicidal Ideation with Any Methods → Suicidal Ideation (Y) or Suicidal Behavior or Attempt (Y). The decision variable Y constrains the knowledge-infused text generator, Constrained_Response_Generation_Model (PK-iL) = Π(Y, Θ2), via response constraints: if Y is in {Suicidality: [Ideation, Behavior, Attempt]}, the output directs the user to a mental health professional, local emergency services, or a suicide prevention hotline such as the National Suicide Prevention Lifeline at 1-800-273-8255. Recorded expert agreement with the generated responses is 70% for PK-iL versus 47% for LLMs. This design supports user-explainability for clinicians and patients, and domain constraints that verify adherence to the clinical diagnostic guideline a clinician understands.
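The response constraint in Figure 4 amounts to a guard on the diagnosis model's decision variable Y. The sketch below uses the label names from the figure; the `constrained_response` function, the condensed safety text, and the `generate` placeholder (standing in for the constrained generation model) are illustrative, not the PK-iL implementation.

```python
# Sketch of the Figure 4 response constraints: the expert-defined domain
# model's output label gates the generated reply.
SUICIDALITY = {"Ideation", "Behavior", "Attempt"}

SAFETY_REPLY = ("Please reach out to a mental health professional, such as a "
                "therapist or counselor. If you are in immediate danger of "
                "harming yourself, please call your local emergency services "
                "or a suicide prevention hotline.")

def constrained_response(y, generate):
    """If Y is a suicidality label, the clinical guideline forces a safety reply."""
    if y in SUICIDALITY:
        return SAFETY_REPLY
    return generate(y)  # unconstrained generation for non-critical labels

reply = constrained_response("Ideation", generate=lambda y: f"Response for {y}")
print(reply == SAFETY_REPLY)  # → True
```

Because the guard is expressed over domain-model labels rather than over raw neural activations, a clinician can audit it directly, which is the user-explainability benefit the text claims for category 2(b).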

Notably, in an implemented system for the mental health diagnostic assistance use case, shown in Figure 4, we see drastic improvements in expert satisfaction with the system's responses, further demonstrating the immense potential of category 2(b) methods.

III. THE FUTURE OF NEUROSYMBOLIC AI

In this article, we compared different neurosymbolic architectures, considering their algorithm-level aspects, which encompass perception and cognition, and their application-level aspects, such as user-explainability, domain constraint specification, scalability, and support for continual learning. The rapid improvement in language models suggests that they will achieve almost optimal performance levels for large-scale perception. Knowledge graphs are suitable symbolic structures for bridging the cognition and perception aspects because they support real-world dynamism: unlike static and brittle symbolic logics, such as first-order logic, they are easy to update. In addition to their suitability for enterprise use cases and their established standards for portability, knowledge graphs are part of a mature ecosystem of algorithms that enable highly efficient graph management and querying. This scalability allows for modeling large and complex datasets with millions or billions of nodes.

In summary, this article highlights the effectiveness of combining language models and knowledge graphs in current implementations. However, it also suggests that future knowledge graphs have the potential to model heterogeneous types of application- and domain-level knowledge beyond schemas. This includes workflows, constraint specifications, and process structures, further enhancing the power and usefulness of neurosymbolic architectures. Combining such enhanced knowledge graphs with high-capacity neural networks would provide the end-user with an extremely high degree of algorithm- and application-level utility. The concern for safety is behind the recent push to halt further rollout of generative AI systems such as GPT*, since current systems could significantly harm individuals and society without additional guardrails. We believe that guidelines, policy, and regulations can be encoded via extended forms of knowledge graphs, such as those shown in Figure 4 (and hence via symbolic means), which in turn can provide explainability, accountability, rigorous auditing capabilities, and safety. Encouragingly, swift progress is being made on all these fronts, and the future looks promising.

ACKNOWLEDGEMENTS

This work was supported in part by the National Science Foundation under Grant 2133842, "EAGER: Advancing Neuro-symbolic AI with Deep Knowledge-infused Learning."
AUTHORS
Amit Sheth is the founding director of the AI Institute
of South Carolina (AIISC), NCR Chair, and a professor of
Computer Science & Engineering at USC. He received the
2023 IEEE-CS Wallace McDowell award and is a fellow
of IEEE, AAAI, AAIA, AAAS, and ACM. Contact him at:
[email protected]
Kaushik Roy is a Ph.D. student with an active publica-
tion record in the area of this article. Contact him at:
[email protected]
Manas Gaur is an assistant professor at UMBC. His dissertation was on Knowledge-infused Learning, with ongoing research focused on the interpretability, explainability, and safety of the systems discussed in this article. He is a recipient of the EPSRC-UKRI Fellowship, the Data Science for Social Good Fellowship, and the AI for Social Good Fellowship, and was recently recognized as a 2023 AAAI New Faculty. Contact him at: [email protected]
REFERENCES
[1] D. Kahneman, Thinking, Fast and Slow. Farrar, Straus and Giroux, 2011.
[2] J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko et al., "Highly accurate protein structure prediction with AlphaFold," Nature, vol. 596, no. 7873, pp. 583–589, 2021.
[3] A. Fawzi, M. Balog, A. Huang, T. Hubert, B. Romera-Paredes, M. Barekatain, A. Novikov, F. J. R. Ruiz, J. Schrittwieser, G. Swirszcz et al., "Discovering faster matrix multiplication algorithms with reinforcement learning," Nature, vol. 610, no. 7930, pp. 47–53, 2022.
[4] D. Gentner, "Structure-mapping: A theoretical framework for analogy," Cognitive Science, vol. 7, no. 2, pp. 155–170, 1983.
[5] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Kamar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg et al., "Sparks of artificial general intelligence: Early experiments with GPT-4," arXiv preprint arXiv:2303.12712, 2023.
[6] A. Sheth, M. Gaur, K. Roy, R. Venkataraman, and V. Khandelwal, "Process knowledge-infused AI: Toward user-level explainability, interpretability, and safety," IEEE Internet Computing, vol. 26, no. 5, pp. 76–84, 2022.
[7] A. d'Avila Garcez and L. C. Lamb, "Neurosymbolic AI: The 3rd wave," Artificial Intelligence Review, pp. 1–20, 2023.
[8] V. Rawte, M. Chakraborty, K. Roy, M. Gaur, K. Faldu, P. Kikani, H. Akbari, and A. P. Sheth, "TDLR: Top semantic-down syntactic language representation," in NeurIPS'22 Workshop on All Things Attention: Bridging Different Perspectives on Attention, 2022.
[9] R. Wang, D. Tang, N. Duan, Z. Wei, X. Huang, G. Cao, D. Jiang, M. Zhou et al., "K-Adapter: Infusing knowledge into pre-trained models with adapters," arXiv preprint arXiv:2002.01808, 2020.
[10] K. Roy, M. Gaur, Q. Zhang, and A. Sheth, "Process knowledge-infused learning for suicidality assessment on social media," arXiv preprint arXiv:2204.12560, 2022.
