View ruebot's full-sized avatar


Compilation of BIOSes for various emulation platforms

5,139 529 Updated Aug 13, 2022

Celery Tasks Monitoring Tool

TypeScript 199 20 Updated Dec 5, 2025

Framework for creating Islandora microservices

Go 5 1 Updated Feb 10, 2026

Faster Whisper transcription with CTranslate2

Python 20,890 1,728 Updated Nov 19, 2025

Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media (AAAI 2024)

Python 10 2 Updated Jul 25, 2024

Expose the contents of .docx files without leaving your terminal. Fast, safe, and smart — no Office required!

Makefile 3,555 85 Updated Feb 10, 2026

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Python 28,249 1,908 Updated Dec 29, 2025

A simple CLI for tracking your working time.

Go 786 80 Updated Jun 28, 2024

[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.

Go 130 5 Updated Sep 4, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,732 3,335 Updated Feb 10, 2026

A whirlwind tour of Common Crawl's data using Python

Python 33 6 Updated Feb 2, 2026

Direct File

JavaScript 4,490 1,354 Updated Jun 5, 2025

Parse Apache access logs

Python 30 2 Updated Jan 12, 2026

Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.

Python 252 45 Updated Feb 10, 2026

A web front end for an Elasticsearch cluster

JavaScript 9,503 2,031 Updated Jul 17, 2021

Weighs the soul of incoming HTTP requests to stop AI crawlers

Go 16,814 495 Updated Feb 9, 2026

[ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning

Python 1,448 87 Updated Jun 26, 2025

Introduction to WebGraphs - Workshop at the IIPC Web Archiving Conference 2025

Shell 3 Updated Apr 10, 2025

Watch (parts of) webpages and get notified when something changes via e-mail, on your phone or via other means. Highly configurable.

Python 2,987 355 Updated Feb 4, 2026

A list of AI agents and robots to block.

Python 3,614 147 Updated Dec 21, 2025

A polite and user-friendly downloader for Common Crawl data

Rust 67 4 Updated Aug 17, 2025

URL-agnostic WARC dedupe server

Go 11 1 Updated Sep 21, 2025

Java 16 1 Updated Apr 19, 2025

End of Term Web Archive 2024

99 16 Updated Apr 1, 2025

Sampling profiler for Python programs

Rust 14,928 498 Updated Feb 5, 2026

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 20,083 2,139 Updated Feb 10, 2026

The Unofficial TikTok API Wrapper In Python

Python 6,060 1,149 Updated Oct 14, 2025

A simple module to collect video, text, and metadata from TikTok.

Python 443 56 Updated Oct 4, 2025

A Lit web-component for viewing a Whisper JSON transcript file

JavaScript 14 1 Updated Dec 3, 2024

A Python program that turns an LLM running on Ollama into an automated researcher, which will, with a single query, determine focus areas to investigate, do web searches and scrape content from vari…

Python 2,954 274 Updated Dec 14, 2024