Open Source Java Natural Language Processing (NLP) Tools

Browse free open source Java Natural Language Processing (NLP) tools and projects below.

  • 1
    Ansj Chinese word segmentation

    Ansj word segmentation

    A Java implementation of the ICT Chinese word segmenter, based on n-Gram + CRF + HMM. The authors report that it segments faster than the open source version of ICT, at about 2 million words per second (tested on a MacBook Air), with accuracy above 96%. It currently provides Chinese word segmentation, Chinese name recognition, part-of-speech tagging, user-defined dictionaries, keyword extraction, automatic summarization, and keyword tagging. It is suited to natural language processing projects that need high-quality word segmentation.
    Downloads: 2 This Week
  • 2
    OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP compon
    Downloads: 16 This Week
  • 3
    AminePlatform

    Amine is a multi-layer platform for the development of intelligent systems

    Amine is an open source, multi-layer Java platform for artificial intelligence, dedicated to the development of various kinds of intelligent systems and agents (knowledge-based, ontology-based, conceptual graph (CG) based, natural language processing, reasoning and learning, etc.). Ontologies and knowledge bases can be created and manipulated with various processes. CG theory is used as the main knowledge representation language. Amine provides two languages: PROLOG+CG, which extends PROLOG with CGs and Amine modules, and SYNERGY, a visual language based on activation and propagation. SYNERGY treats CGs as activable, executable graphs. For more detail, see //amine-platform.sourceforge.net/
    Downloads: 12 This Week
  • 4
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is a one-stop shop for natural language processing in Java. It lets users derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports six languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. A pipeline takes in raw text, runs a series of NLP annotators on it, and produces a final set of annotations. Pipelines produce CoreDocuments, data objects that contain all of the annotation information, are accessible through a simple API, and can be serialized to a Google Protocol Buffer.
    Downloads: 1 This Week
  • 5
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP is a command line tool that searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It matches names and content in database tables, plain files, MS Office documents, PDFs, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP searches resources nested within other resources in any combination and to any depth, for example text within a document within a zip archive. Binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/) are hosted here. Development and issue tracking take place at https://bitbucket.org/cryanfuse/crgrep
    Downloads: 6 This Week
  • 6
    JWNL is a Java API for accessing the WordNet relational dictionary. WordNet is widely used for developing NLP applications, and a Java API such as JWNL will allow developers to more easily use Java for building NLP applications.
    Downloads: 5 This Week
  • 7
    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open source, cross-platform, Unicode and XML based text analysis environment and graphical client that supports Windows, Linux, and Mac OS X. It can also be used online as a J2EE-compliant web portal (GWT based) with built-in access control. Download the latest version of TXM at http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en. TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project web site, http://textometrie.ens-lyon.fr/?lang=en. Read a full description at the TEI Tools wiki, http://wiki.tei-c.org/index.php/TXM.
    Downloads: 15 This Week
  • 8
    masmt

    A framework for multi-agent system development

    MaSMT is a Java-based multi-agent system development framework, designed especially for developing an English to Sinhala machine translation system. Its architecture can also be used to develop any multi-agent based system. References: B. Hettige, A. S. Karunananda, G. Rzevski, "Multi-agent solution for managing complexity in English to Sinhala Machine Translation", International Journal of Design & Nature and Ecodynamics, Volume 11, Issue 2, 2016, 88–96. B. Hettige, A. S. Karunananda, G. Rzevski, "MaSMT: A Multi-agent System Development Framework for English-Sinhala Machine Translation", International Journal of Computational Linguistics and Natural Language Processing (IJCLNLP), Volume 2, Issue 7, July 2013.
    Downloads: 3 This Week
  • 9

    BioC

    A simple XML format for sharing text documents and annotations

    A minimalist approach to sharing text documents and data annotations that allows a large number of different annotations to be represented. Project files contain simple code to hold, read, and write data and perform sample processing; BioC-formatted corpora; and BioC tools that work with BioC corpora. The goals of BioC are simplicity, interoperability, broad use, and reuse. Learning to use a format, or a software module that processes that format, should require little investment. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for text mining.
    Downloads: 5 This Week
  • 10
    MutationFinder is a biomedical natural language processing (NLP) system for extracting mentions of point mutations from free text. MutationFinder achieves high performance (99% precision, 81% recall on blind test data) as an information extraction system
    Downloads: 3 This Week
  • 11

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at http://rslp.racai.ro/index.php?page=tts. This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. To read more about our other NLP and TTS tools, see http://nlptools.racai.ro.
    Downloads: 1 This Week
  • 12

    Service Grid - Language Grid Base System

    SOA infrastructure initially developed by the NICT Language Grid Project

    Service Grid is an infrastructure for accumulating and sharing Web services. Resources with complicated intellectual property issues are wrapped as Web services and shared on the Service Grid.

    If you release your software by using the software of this project, please include the following description in the documents or on the website:
    * This software uses the [SOFTWARE] by the Language Grid project (http://langrid.org/).

    [SOFTWARE] is one of:
    * Service Grid Server Software (http://langrid.org/oss-project/en/service_grid.html)
    * Language Service Development Libraries (http://langrid.org/oss-project/en/language_service.html)
    * Language Grid Toolbox (http://langrid.org/oss-project/en/toolbox.html)

    If you publish a paper by using the software of this project, please cite the following book:
    * Toru Ishida, Ed. The Language Grid: Service-Oriented Collective Intelligence for Language Resource Interoperability. Springer, 2011. ISBN 978-3-642-21177-5.
    Downloads: 1 This Week
  • 13

    Stella

    Elementary Arithmetic Word Problem Solver

    A basic natural language processor capable of solving elementary math word problems such as:
    1) John has 7 pencils. He buys 10 more pencils. How many pencils does he have in all?
    2) There are 6 boys and 4 girls in a library. How many kids are there altogether?
    3) Bob ran 8 miles and Stella ran 2 miles. How many miles did they run altogether?
    4) What is the sum of 7 and 2?
    5) What is 8 divided by 2?
    6) What is the 10 subtracted by 2?
    7) What is 2 subtracted from 10?

    It also shows users how the answer was reached. Nominal user education level: 1st grade. This is educational software, designed chiefly to aid elementary school students in the study of arithmetic word problems. It uses part-of-speech tagging software developed and published by the Natural Language Processing Group at Stanford University. Many thanks! (Full citation in README)
    Downloads: 1 This Week
  • 14
    Grok is a library of natural language processing components, including support for parsing with categorial grammars and various preprocessing tasks such as part-of-speech tagging, sentence detection, and tokenization.
    Downloads: 1 This Week
  • 15
    Next Generation Programming

    Compose Software Without Writing Any Programming Code

    "Next Generation Programming - Programming Without Coding Software" is a drag-and-drop wizard for creating simple or complex applications without writing code in any programming language. The software is written in Java and is intended for both novice and expert programmers, who build software with visual tools such as drag-and-drop components and visual editors. It can be used to compose simple or complex applications: database programs, circuit designs, and generated code that is uploaded to the chip of a designed circuit (ESP8266 and ESP32 chips). According to the authors, the software is much simpler to use than PWCT (https://sourceforge.net/projects/doublesvsoop/) and has more features, such as SCADA. Start by looking at the examples on the website to learn the software's features and how to use it in a short time. More information (documents, videos, examples, ...): negep.epizy.com
    Downloads: 1 This Week
  • 16
    The Aikernel is an intelligence server and cell runtime environment that uses natural language processing and other pattern matching with Activators, Contexts, and Concepts to allow multitasking between installed cells.
    Downloads: 0 This Week
  • 17
    Apache OpenNLP

    Apache OpenNLP is a machine learning-based NLP library that provides tools for text-processing tasks such as tokenization, sentence segmentation, and named entity recognition.
    Downloads: 0 This Week
  • 18
    AutoSummary uses Natural Language Processing to generate a contextually-relevant synopsis of plain text. It uses statistical and rule-based methods for part-of-speech tagging, word sense disambiguation, sentence deconstruction and semantic analysis.
    Downloads: 0 This Week
  • 19
    This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text. The method is based on SVM but other ML algorithms can be adopted. The method details are explained in the following paper: Ehsan Emadzadeh, Azadeh Nikfarjam, and Graciela Gonzalez. 2011. Double Layered Learning for Biological Event Extraction from Text. In Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task, Portland, Oregon, June. Association for Computational Linguistic
    Downloads: 0 This Week
  • 20
    BioNLP is an initiative by the University of Colorado Denver Health Sciences Center to create and distribute code, software, and data for applying natural language processing techniques to biomedical texts
    Downloads: 0 This Week
  • 21
    The BioNLP UIMA Component Repository provides UIMA wrappers for novel and well-known third-party NLP tools used in biomedical text processing, such as tokenizers, parsers, named entity taggers, and tools for evaluation.
    Downloads: 0 This Week
  • 22
    CoPT, Corpus Processing Tools, is a set of java classes intended to assist field linguists, NLP researchers and developers, students and software developers in all corpus-related processing.
    Downloads: 0 This Week
  • 23
    Free and open source grammar checker. It can currently identify two kinds of errors:
    1) Incomplete sentence (fragment). Example: "John a very man."
    2) Subject-verb plurality agreement. Example: "They walks into a classroom."

    This software uses part-of-speech tagging software developed and published by the Natural Language Processing Group at Stanford University. Many thanks! (Full citation in README)
    Downloads: 0 This Week
  • 24
    D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
    Downloads: 0 This Week
  • 25
    DGiovanni
    A multi-agent architecture for building interactive dramas. It uses Jason's BDI engine; Jason's agent-oriented programming language is used both for drama management and for authoring character behaviors.
    Downloads: 0 This Week
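The Stella entry in the listing above gives sample questions such as "What is the sum of 7 and 2?" and "What is 2 subtracted from 10?". As a rough illustration of how the simplest of these phrasings can be mapped to arithmetic, here is a minimal Java sketch that uses only regular expressions. It is not Stella's implementation (the listing says Stella relies on part-of-speech tagging), and the class name `WordProblemSketch`, the method `solve`, and the four patterns are all hypothetical names chosen for this example. It handles only the direct "What is ..." questions, not the multi-sentence story problems.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT Stella's method. All names here are hypothetical.
class WordProblemSketch {
    // A whole number of at most 9 digits, so Integer.parseInt cannot overflow.
    private static final String NUM = "\\b(\\d{1,9})\\b";

    private static final Pattern SUM = Pattern.compile("sum of " + NUM + " and " + NUM);
    private static final Pattern DIVIDED_BY = Pattern.compile(NUM + " divided by " + NUM);
    private static final Pattern SUBTRACTED_BY = Pattern.compile(NUM + " subtracted by " + NUM);
    private static final Pattern SUBTRACTED_FROM = Pattern.compile(NUM + " subtracted from " + NUM);

    // Returns the answer, or an empty result when no supported phrasing is found.
    static OptionalInt solve(String question) {
        String q = question.toLowerCase();

        Matcher m = SUM.matcher(q);
        if (m.find()) {
            return OptionalInt.of(first(m) + second(m));
        }
        m = DIVIDED_BY.matcher(q);
        if (m.find()) {
            int divisor = second(m);
            // Integer division; dividing by zero is reported as "no answer".
            return divisor == 0 ? OptionalInt.empty() : OptionalInt.of(first(m) / divisor);
        }
        m = SUBTRACTED_BY.matcher(q);
        if (m.find()) {
            // "10 subtracted by 2" is read as 10 - 2.
            return OptionalInt.of(first(m) - second(m));
        }
        m = SUBTRACTED_FROM.matcher(q);
        if (m.find()) {
            // "2 subtracted from 10" reverses the operands: 10 - 2.
            return OptionalInt.of(second(m) - first(m));
        }
        return OptionalInt.empty();
    }

    private static int first(Matcher m) {
        return Integer.parseInt(m.group(1));
    }

    private static int second(Matcher m) {
        return Integer.parseInt(m.group(2));
    }

    public static void main(String[] args) {
        String[] questions = {
            "What is the sum of 7 and 2?",
            "What is 8 divided by 2?",
            "What is the 10 subtracted by 2?",
            "What is 2 subtracted from 10?"
        };
        for (String question : questions) {
            OptionalInt answer = solve(question);
            System.out.println(question + " -> "
                + (answer.isPresent() ? String.valueOf(answer.getAsInt()) : "unsupported"));
        }
    }
}
```

Running `main` on the four questions prints the answers 9, 4, 8, and 8. Fixed patterns like these break as soon as the wording varies, which is presumably why a tool such as Stella builds on a part-of-speech tagger instead.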