Open Source Python Text Processing Software

Browse free open source Python Text Processing Software and projects below.

  • 1
    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Downloads: 16,912 This Week
  • 2
    Diffuse
    Diffuse is a graphical tool for comparing and merging text files. It can retrieve files for comparison from Bazaar, CVS, Darcs, Git, Mercurial, Monotone, RCS, Subversion, and SVK repositories.
    Downloads: 303 This Week
  • 3
    Notepad++ Python Script

    A Python Scripting plugin for Notepad++

    Provides complete, easy script access to all of the editor's features (including everything in Scintilla), configurable menus and toolbar options, and the ability to assign shortcuts to scripts.
    Downloads: 237 This Week
  • 4
    Utilities for general- and special-purpose documentation. Includes reStructuredText, the easy-to-read, easy-to-use, what-you-see-is-what-you-get plaintext markup language.
    Downloads: 142 This Week
  • 5
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages through an interactive, intuitive graphical interface. It is a front end for python-pyPdf.
    Downloads: 80 This Week
  • 6
    meld-installer

    Meld Installer for Windows

    Bundles Portable Python (with PyGTK) and Meld together in an easy-to-use installer, so you do not have to set up Python or PyGTK yourself and can keep Meld's Python separate from other Python installations on your machine. NOTE: Meld 3.11 and later have official installers, so this project is no longer supported. You can download the new installer here: https://download.gnome.org/binaries/win32/meld/. You should uninstall the old 1.8 version before upgrading.
    Downloads: 45 This Week
  • 7
    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic text processing: sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. TextBlob provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and translation. It stands on the giant shoulders of NLTK and pattern, and plays nicely with both. It supports word inflection (pluralization and singularization), lemmatization, and spelling correction, and it comes with WordNet integration. New models or languages can be added through extensions. If you only intend to use TextBlob’s default models (no model overrides), you can pass the lite argument when downloading corpora; this downloads only those corpora needed for basic functionality. TextBlob is also available as a conda package.
    Downloads: 2 This Week
  • 8
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available through the Tika REST Server. This makes Apache Tika available as a Python library that is easy to install via Setuptools or pip. To use this library, you need Java 7+ installed on your system, because tika-python starts the Tika REST server in the background. To get this working in a disconnected environment, download a Tika server file (both tika-server.jar and tika-server.jar.md5) and set the TIKA_SERVER_JAR environment variable to TIKA_SERVER_JAR="file:////tika-server.jar". This tells python-tika to "download" that file, move it to /tmp/tika-server.jar, and run it as a background process. This is the only way to run python-tika without internet access. Without this variable set, the default is to check the Tika version and pull the latest from Apache every time.
    Downloads: 2 This Week
  • 9
    DrPython is a highly customizable cross-platform IDE to aid programming in Python. It was developed with teaching in mind, and has a clean, simple interface. It is written in Python, using wxPython for the GUI.
    Downloads: 8 This Week
  • 10
    PyRtfLib is a Python library that provides a parser and a few translators, such as RTF to HTML and RTF to plain text.
    Downloads: 15 This Week
  • 11
    Gyrfalcon is a note / thought / task management system. Take your notes and other bits of information and: put the notes in hierarchical trees, tag, search, hyperlink, etc. Gyrfalcon is also designed with a clean interface that avoids modal interactions.
    Downloads: 7 This Week
  • 12
    The converter automatically performs the full process of converting the files of a C project into the equivalent C++ files. Classes are created, variables and functions become attributes and methods, and the changes are propagated to all files.
    Downloads: 6 This Week
  • 13

    arCHMage

    A reader and decompiler for files in the CHM format

    arCHMage is a reader and decompiler for files in the CHM format. This is the format used by Microsoft HTML Help, and is also known as Compiled HTML.
    Downloads: 3 This Week
  • 14
    This tool converts HTML to MediaWiki markup.
    Downloads: 3 This Week
  • 15
    Tomoe is a handwriting character recognition engine.
    Downloads: 2 This Week
  • 16
    pyfiglet is a full port of the FIGlet specification (http://www.figlet.org/) into pure Python. It takes ASCII text and renders it in ASCII art fonts. It can be used on the command line or as an object-oriented driver library in your own programs.
    Downloads: 2 This Week
  • 17
    LMA KeySwitch

    LMA KeySwitch — Your Keyboard’s Best Friend!

    LMA KeySwitch v2.0 (2025): a lightweight, free tool for seamless keyboard language switching.
    Multi-Language Support: Includes Ukrainian, English, French, German, Spanish, Italian, Portuguese, Turkish, Arabic, Japanese, Chinese, Hindi, Russian, and more. New languages can be added easily.
    Visual Feedback: Displays language flags at the cursor and in the system tray.
    Sound Cues: One click for native language, two for English, three for others.
    Customization: Flexible settings to suit your preferences.
    No Dependencies: Works out of the box with no bugs or installation hassles.
    LMA KeySwitch is an open-source utility designed to enhance your keyboard experience by offering instant language switching with visual flag indicators and customizable sound cues. Developed by MIHEY in Ukraine, this tool is fully automated, requiring no extra libraries or complex setup. Released in 2025, it’s free to use with regular updates.
    Downloads: 1 This Week
  • 18
    PyWord is a powerful and flexible text editor written in Python. It aims to be similar to other, existing editors (including emacs), but has several unique features as well.
    Downloads: 1 This Week
  • 19
    Word segmentation utility for the Thai language, written in C.
    Downloads: 1 This Week
  • 20
    Alphabetizer

    Take a list of words or sentences and arrange them alphabetically.

    Alphabetizer is a tool that takes a list of words, phrases, or sentences and arranges them in alphabetical order. It is useful for organizing information, creating glossaries, sorting names, or any task where the items in a list need to be in alphabetical order. Overall, Alphabetizer can save time and effort by quickly organizing information and making it easier to read and comprehend.
    Downloads: 1 This Week
  • 21

    Indexmeister

    Automatic indexing for large LaTeX documents

    Indexmeister reads a variety of formats (.tex, .docx, .epub, and others) and suggests keywords for indexing. The included program Imbrowse provides a semi-automatic interface for rapidly adding index tags to multi-file LaTeX documents.
    Downloads: 1 This Week
  • 22
    Turn a PC keyboard into a musical instrument! Keyano can turn your PC into a musical keyboard, or you can select Alphabet mode and it becomes fun and educational for kids. Type "A B C" and it says the letters out loud while showing them on screen.
    Downloads: 1 This Week
  • 23
    Python scripts for converting Chinese Pinyin transcription (ISO 7098) to the International Phonetic Alphabet (IPA), comprising a core module for developers and a flexible GUI application for end users working on Modern Chinese phonetics.
    Downloads: 1 This Week
  • 24
    PyRTF is a pure Python module for the efficient creation of RTF documents.
    Downloads: 1 This Week
  • 25
    Pyana is an extension module that allows Python programs to interface with the Apache Software Foundation's Xalan XSLT transformation engine.
    Downloads: 1 This Week
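
The alphabetical ordering described in the Alphabetizer entry (20) can be sketched in a few lines of Python. This is only an illustrative sketch, not Alphabetizer's own code: the function name `alphabetize` and the choices to ignore case and to drop blank entries are assumptions made for the example.

```python
def alphabetize(items):
    """Return the non-empty items sorted alphabetically, ignoring case.

    Hypothetical helper for illustration; not taken from the Alphabetizer project.
    """
    # Trim surrounding whitespace and skip entries that are empty after trimming.
    cleaned = [item.strip() for item in items if item.strip()]
    # str.casefold gives a case-insensitive sort key, so "Apple" sorts before "banana".
    return sorted(cleaned, key=str.casefold)


if __name__ == "__main__":
    words = ["pear", "Apple", "banana", "  cherry "]
    print(alphabetize(words))  # prints ['Apple', 'banana', 'cherry', 'pear']
```

Sorting with a plain `sorted(words)` would place every capitalized word before every lowercase one, which is why the sketch uses a case-insensitive key.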