GenAI Business Analyzer
Business Ideas
Mike Ali
July 2024
Definitions
https://cset.georgetown.edu/article/what-are-generative-ai-large-language-models-and-foundation-models/
Generative AI
Generative AI is a broad term that can be used for any AI system whose primary function is to generate content. This is in contrast
to AI systems that perform other functions, such as classifying data (e.g., assigning labels to images), grouping data (e.g.,
identifying customer segments with similar purchasing behavior), or choosing actions (e.g., steering an autonomous vehicle).
• Separate technical feasibility from investment feasibility (infeasible tech may still be a good investment, and
feasible tech may still be a bad investment)
• Use checklists instead of misleading correlations to assess technical feasibility (e.g., “it can pass the bar exam, so
it can think like a lawyer, so it can replace lawyers!”)
• Address the time factor, both where it matters and where it does not (e.g., Generative AIs will never demonstrate
common-sense reasoning because they do not reason and have no common sense)
• Capture that robo-advisors based on ChatGPT benefit only investors in OpenAI, not the business that creates
them; it is not clear what the barriers to entry are for competitors
• Identify frauds such as Devin and Rabbit R1
Assumptions
• Advisors are always technically feasible because a human is the ultimate decision-maker
• Plans to address technical deficiencies of Generative AIs by adding additional technology do not make the business
idea feasible; the basis of the idea is still Generative AI
Generative AI Business Idea Evaluation Tool

[Flowchart elements: Use Case; Category Decision (Autonomous Agent or Advisor); Technically Feasible? (Feasible or Unfeasible); Timeframe (Now or Future); Require specialty LLM? (Yes or No); Validate technology roadmap; Investment Decision (A to E)]

Feasibility Decision
Item
• Complicated reasoning or long-term planning required
• Deep contextual understanding of nuanced and possibly ambiguous information required
• High accuracy and reliability is required
• Emotional intelligence/empathy required
• Generation of novel and/or original ideas or that require integration of interdisciplinary insights is required
• Traceability and explainability of decision-making rationale is required
• High quality training data is available or already ingested
• Prompts contain context
• Need output as plans or steps
SCORE (# yes’s)
0 = Feasible
>0 = Unfeasible

Timeframe Decision
Item
• Runtime requires advances in CPU/GPU speed and/or storage
• Data exists, but not yet ingested into an LLM
SCORE (# yes’s)
>0 = Future

Specialty LLM Decision
Item
• Requires domain knowledge not in general LLM
• Must be trained by experts in the field of knowledge
• Results require a high degree of accuracy even with human review
SCORE (# yes’s)
>0 = Yes

Investment Decision
A: Validate business case
B: Identify R&D investment business case
C: Identify “greater fool” business case
D: Identify general LLM business case
E: Identify specialty LLM business case
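The scoring rule shared by the three checklists (count the "yes" answers and compare the count with zero) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the shortened item keys are hypothetical, and the sketch covers only the scoring, not the routing from the decisions to the investment outcomes A to E.

```python
from typing import Mapping


def count_yes(answers: Mapping[str, bool]) -> int:
    """Checklist score: the number of items answered "yes"."""
    return sum(1 for is_yes in answers.values() if is_yes)


def feasibility_decision(answers: Mapping[str, bool]) -> str:
    # Rule from the checklist: 0 = Feasible, >0 = Unfeasible.
    return "Feasible" if count_yes(answers) == 0 else "Unfeasible"


def timeframe_decision(answers: Mapping[str, bool]) -> str:
    # Rule from the checklist: >0 = Future; a score of 0 is taken to mean "Now".
    return "Future" if count_yes(answers) > 0 else "Now"


def specialty_llm_decision(answers: Mapping[str, bool]) -> str:
    # Rule from the checklist: >0 = Yes (a specialty LLM is required).
    return "Yes" if count_yes(answers) > 0 else "No"


# Hypothetical answers for an imagined use case; keys are shortened item names.
example = {
    "complicated reasoning or long-term planning": False,
    "high accuracy and reliability": True,
    "emotional intelligence/empathy": False,
}
print(feasibility_decision(example))  # prints "Unfeasible"
```

A single "yes" is enough to flip each decision, which is why the sketch compares the score with zero rather than with a tunable threshold.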
Test Cases and Results

Test Case Name: ChatGPT 4.0
Description: Latest version of the best-known Generative AI application
Reference: https://openai.com/index/chatgpt/
Desired Analyzer output: Can be both an autonomous agent and an advisor, so need to run both sides of the flowchart. Results: A and D

Test Case Name: Claude
Description: Another ChatGPT-type tool, by Anthropic
Reference: https://claude.ai/new
Desired Analyzer output: Can be both an autonomous agent and an advisor, so need to run both sides of the flowchart. Results: A and D

Test Case Name: BCG “helpers”
Description: Boston Consulting Group (BCG) study that demonstrated humans+AI performed better than humans alone
Reference: https://www.hbs.edu/ris/Publication%20Files/24-013_d9b45b68-9e74-42d6-a1c6-c72fb70c7282.pdf
Desired Analyzer output: Result D. This study showed that ChatGPT positively impacts work productivity when used as an assistant.

Test Case Name: Devin
Description: Bot that can take Upwork jobs and execute them without supervision. Revealed to be a fraud.
Reference: https://www.youtube.com/watch?v=tNmgmwEtoWE
Desired Analyzer output: Result C, AI equivalent of a human, fails multiple parts of the Feasibility checklist

Test Case Name: Bar Exam
Description: Tools that can pass the bar
Reference: https://www.livescience.com/technology/artificial-intelligence/gpt-4-didnt-ace-the-bar-exam-after-all-mit-research-suggests-it-barely-passed
Desired Analyzer output: Result A (yes, it can pass, but so what?)

Test Case Name: USMLE
Description: Tools that can pass the US Medical Licensing Examination (USMLE)
Reference:
[1] https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000198
[2] https://www.nature.com/articles/s41598-023-43436-9
[3] https://journals.lww.com/academicmedicine/fulltext/2024/02000/examining_chatgpt_performance_on_usmle_sample.24.aspx#:~:text=Applying%20that%20scoring%20rule%20in,to%20pass%20the%20Step%20examinations
Desired Analyzer output: Result A (yes, it can pass, but so what?)

Test Case Name: Rabbit R1
Description: Device that creator claimed used an advanced AI “Large Action Model” (LAM) that learns user habits and websites to perform a wide variety of tasks. Revealed to be a fraud
Reference:
[1] https://www.emergetools.com/deep-dives/rabbit-r1
[2] https://techwireasia.com/2024/01/rabbit-r1-the-future-of-ai-assisted-living-starts-here/
Desired Analyzer output: Result C, predicted by failing multiple parts of the Feasibility Decision checklist, especially the need for complicated reasoning and contextual understanding

Test Case Name: Zoom’s AI substitute
Description: “Eric, you have five meetings scheduled today. You do not need to join four of the five. You can send a digital version of yourself.”
Reference: https://www.theverge.com/2024/6/3/24168733/zoom-ceo-ai-clones-digital-twins-videoconferencing-decoder-interview
Desired Analyzer output: Result C, predicted by the fact that an AI equivalent of a human being fails every part of the Feasibility Checklist

Test Case Name: Thrive AI Health
Description: “a customized, hyper-personalized AI health coach” “it might recommend a healthy, inexpensive recipe …to replace a fast-food dinner.”
Reference:
[1] https://time.com/6994739/ai-behavior-change-health-care/
[2] https://www.theatlantic.com/technology/archive/2024/07/thrive-ai-health-huffington-altman-faith/678984/
Desired Analyzer output: Result C, requires accuracy, traceability, high quality training data, etc., so fails the Feasibility Checklist
Appendix: Technical Decision Checklist References

The feasibility checklist was developed from the sources in this appendix:
1) Asking ChatGPT 3.5 and 4.0 to generate a decision tree on the limitations of Generative AI
2) Asking Claude the same question
3) MIT working paper on LLM limitations

Feasibility Decision
Item
• Complicated reasoning or long-term planning required
• Deep contextual understanding of nuanced and possibly ambiguous information required
• High accuracy and reliability is required
• Emotional intelligence/empathy required
• Generation of novel and/or original ideas or that require integration of interdisciplinary insights is required
• Traceability and explainability of decision-making rationale is required
• High quality training data is available or already ingested
• Prompts contain context
• Need output as plans or steps
SCORE (# yes’s)
0 = Feasible
>0 = Unfeasible
Decision Tree for Determining if a Proposed Application is Within the Capabilities of Generative AI and LLM Systems (ChatGPT 4.0)

1. Does the application require deep conceptual understanding?
Yes: Proceed to question 2.
No: Proceed to question 3.
2. Does the application involve complex reasoning or long-term planning?
Yes: Likely exceeds capabilities.
No: Proceed to question 3.
3. Is high accuracy and reliability crucial for the application?
Yes: Proceed to question 4.
No: Proceed to question 5.
4. Are there significant consequences for errors or misinformation?
Yes: Likely exceeds capabilities.
No: Proceed to question 5.
5. Does the application require real-time decision making based on dynamic inputs?
Yes: Proceed to question 6.
No: Proceed to question 7.
6. Does the application need to make quick, adaptive decisions in changing environments?
Yes: Likely exceeds capabilities.
No: Proceed to question 7.
7. Is the application dependent on understanding and interpreting ambiguous or nuanced information?
Yes: Proceed to question 8.
No: Proceed to question 9.
8. Does the application require deep emotional intelligence or empathy?
Yes: Likely exceeds capabilities.
No: Proceed to question 9.
9. Does the application involve creative, original thought or innovation?
Yes: Proceed to question 10.
No: Proceed to question 11.
10. Does the application require generating entirely novel ideas not derived from existing data?
Yes: Likely exceeds capabilities.
No: Proceed to question 11.
11. Does the application require ethical or moral judgment?
Yes: Proceed to question 12.
No: Proceed to question 13.
12. Does the application involve resolving complex moral dilemmas or ethical decision making?
Yes: Likely exceeds capabilities.
No: Proceed to question 13.
13. Does the application handle sensitive or personal information requiring high privacy and security?
Yes: Proceed to question 14.
No: Proceed to question 15.
14. Are there strict privacy regulations or standards that must be adhered to?
Yes: Likely exceeds capabilities.
No: Proceed to question 15.
15. Is there a need for clear explainability and transparency in the decision-making process?
Yes: Proceed to question 16.
No: Proceed to question 17.
16. Is it crucial for the rationale behind decisions to be easily understandable and traceable?
Yes: Likely exceeds capabilities.
No: Proceed to question 17.
17. Does the application require integration of knowledge from multiple disciplines?
Yes: Proceed to question 18.
No: Proceed to question 19.
18. Does the application involve synthesizing interdisciplinary insights coherently?
Yes: Likely exceeds capabilities.
No: Proceed to question 19.
19. Is the application resource-intensive, requiring significant computational power?
Yes: Proceed to question 20.
No: Likely within capabilities.
20. Are there limitations in budget or infrastructure for deploying large models?
Yes: Likely exceeds capabilities.
No: Likely within capabilities.

Summary
If at any point the decision tree leads to "Likely exceeds capabilities," the proposed application may require
capabilities beyond current generative AI and LLM systems. If all checks are passed without indicating
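The twenty questions above form a small state machine: each question points to another question or to a verdict. A minimal table-driven sketch in Python follows. The names `TREE` and `walk_tree` are hypothetical, and treating an unanswered question as "no" is an assumption of this sketch, not something the decision tree specifies.

```python
from typing import Dict, Mapping, Tuple, Union

EXCEEDS = "Likely exceeds capabilities"
WITHIN = "Likely within capabilities"

Step = Union[int, str]  # a question number or a final verdict

# Question number -> (step on "yes", step on "no").
# Odd questions open a topic; the even question after each one is its follow-up.
TREE: Dict[int, Tuple[Step, Step]] = {}
for q in range(1, 21, 2):
    after_topic: Step = q + 2 if q < 19 else WITHIN
    TREE[q] = (q + 1, after_topic)        # e.g. 1 -> (2, 3)
    TREE[q + 1] = (EXCEEDS, after_topic)  # e.g. 2 -> (EXCEEDS, 3)


def walk_tree(answers: Mapping[int, bool]) -> str:
    """Follow the tree from question 1 until a verdict is reached."""
    step: Step = 1
    while isinstance(step, int):
        on_yes, on_no = TREE[step]
        # Assumption: a question without an answer is treated as "no".
        step = on_yes if answers.get(step, False) else on_no
    return step


# Hypothetical application: accuracy is crucial (3) and errors are costly (4).
print(walk_tree({3: True, 4: True}))  # prints "Likely exceeds capabilities"
```

Because every even-numbered question is reachable only through a "yes" on the question before it, an answer to a follow-up alone never changes the verdict.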
Decision Tree Framework (ChatGPT 3.5)

Task Complexity
Is the task domain-specific or highly specialized?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Real-time Requirements
Does the application require real-time responses?
Yes: Assess further for latency concerns.
High stakes?: Likely exceeds capabilities.
No: Proceed to next question.

Data Sensitivity
Does the application handle sensitive or confidential data?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Emotional Intelligence Needs
Does the application require empathetic or emotionally nuanced interactions?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Interpretation of Ambiguous Input
Will the application deal with vague or ambiguous queries?
Yes: Assess risk of misinterpretation.
High stakes?: Likely exceeds capabilities.
No: Proceed to next question.

Critical Decision-Making
Is the application involved in high-stakes decision-making?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Consistency and Reliability
Does the application require consistent and reliable outputs?
Yes: Assess output stability.
Critical context?: Likely exceeds capabilities.
No: Proceed to next question.

Long-term Memory Requirements
Is persistent memory over multiple interactions needed?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Integration with Other Systems
Does the application need complex integrations with other software or hardware?
Yes: Likely exceeds capabilities.
No: Proceed to next question.

Ethical Considerations
Are there significant ethical implications or bias concerns?
Yes: Likely exceeds capabilities.
No: Application likely within capabilities.

Conclusion
If the proposed application answers “Yes” to any of the critical
questions, it may exceed the capabilities of generative AI and LLM
systems. Otherwise, it may be feasible to implement using these
technologies.
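Unlike a branching tree, this framework is a linear screen: the questions are asked in order and the first disqualifying "yes" ends the evaluation. A short Python sketch follows. The names `ORDER`, `FOLLOW_UP`, and `screen` are hypothetical. The framework does not say what happens when a follow-up such as "High stakes?" is answered "no", so the sketch assumes the screen simply moves on to the next question.

```python
from typing import Mapping

# The ten categories, in the order the framework asks them.
ORDER = [
    "Task Complexity",
    "Real-time Requirements",
    "Data Sensitivity",
    "Emotional Intelligence Needs",
    "Interpretation of Ambiguous Input",
    "Critical Decision-Making",
    "Consistency and Reliability",
    "Long-term Memory Requirements",
    "Integration with Other Systems",
    "Ethical Considerations",
]

# Categories where "yes" leads to a follow-up question before any verdict.
FOLLOW_UP = {
    "Real-time Requirements": "High stakes?",
    "Interpretation of Ambiguous Input": "High stakes?",
    "Consistency and Reliability": "Critical context?",
}


def screen(answers: Mapping[str, bool], follow_ups: Mapping[str, bool]) -> str:
    """Return the verdict for one application; missing answers count as "no"."""
    for category in ORDER:
        if not answers.get(category, False):
            continue  # "No: Proceed to next question."
        # Assumption: a "no" on the follow-up means the screen continues.
        if category not in FOLLOW_UP or follow_ups.get(category, False):
            return f"Likely exceeds capabilities ({category})"
    return "Application likely within capabilities"


# Hypothetical application: real-time responses needed, but not high stakes.
print(screen({"Real-time Requirements": True}, {"Real-time Requirements": False}))
```

Returning the category that triggered the verdict keeps the result traceable to a specific question.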
Give me a decision tree to determine when a proposed
application is within the capabilities of a generative ai system
Claude
1. Limitations of Reasoning
Contrary to the impression they might create, LLMs are not built
for complex reasoning.