Inspiration

Student teams often waste a lot of time searching for grants and competitions because the information is scattered across many websites and Telegram channels. This creates “information noise” and leads to missed deadlines and lost opportunities. Our goal was to build one clear entry point where students can quickly find relevant funding programs and understand their requirements.

What we learned

We learned that the grant ecosystem is highly fragmented: different organizers publish announcements in different formats, with different rules, deadlines, and reporting requirements. Classic aggregators usually lack personalization, while general-purpose LLM assistants can return incorrect details (“hallucinations”) and do not continuously monitor new offers. This confirmed the need for a specialized system that collects data continuously, structures it, and supports users with reliable answers based on original documents.

How we built the project

We created “Data Point”, an intelligent grant aggregator and personal digital navigator for students.

Core features (MVP):

  • Automated monitoring of open sources (Telegram channels and websites) and extraction of key fields (deadlines, amounts, eligibility, requirements) into a unified structured format.
  • Semantic search & recommendations: user queries in natural language are normalized and categorized (domain, audience, geography), then ranked so the most suitable grants appear near the top of the results.
  • Q&A over grant documents (RAG): the system answers specific questions using facts retrieved from grant documentation via hybrid search (BM25 + vector search), aiming to ground responses in sources.
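To illustrate how BM25 and vector rankings can be combined, here is a minimal sketch using reciprocal rank fusion (RRF), a common score-free fusion method. The function name and the sample grant IDs are illustrative, not taken from the actual Data Point codebase, which may fuse results differently.

```python
from collections import defaultdict

def reciprocal_rank_fusion(bm25_ranking, vector_ranking, k=60):
    """Fuse two ranked lists of document IDs into one.

    Each document's fused score is the sum of 1 / (k + rank) over the
    rankings it appears in; k=60 is the commonly used RRF constant.
    """
    scores = defaultdict(float)
    for ranking in (bm25_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings for three grant documents:
bm25 = ["grant_a", "grant_b", "grant_c"]
vect = ["grant_b", "grant_c", "grant_a"]
print(reciprocal_rank_fusion(bm25, vect))  # grant_b wins: ranked high in both lists
```

RRF has the practical advantage that BM25 scores and cosine similarities never need to be put on a common scale, since only ranks matter.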

Architecture:

  • Microservice approach with a Django backend, React/Vite + TypeScript frontend, PostgreSQL + pgvector for embeddings, REST APIs, and Docker-based containerization.
  • A data pipeline: parser → LLM extraction → optional “snapshot” of full documentation → moderation queue → searchable database.
  • Telegram is used for authentication (Telegram Auth Widget) and planned notifications (via bot).
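For the Telegram side, login payloads must be verified server-side. Telegram's documented check for the Login Widget is: build a data-check string from the sorted fields (excluding `hash`), key an HMAC-SHA256 with SHA-256 of the bot token, and compare against the received `hash`. A minimal sketch (function name is ours, the algorithm is Telegram's documented one):

```python
import hashlib
import hmac

def verify_telegram_auth(auth_data: dict, bot_token: str) -> bool:
    """Verify the `hash` field sent by the Telegram Login Widget.

    Per Telegram's docs: the data-check string is "key=value" pairs
    sorted alphabetically and joined with newlines, and the HMAC key
    is SHA-256 of the bot token.
    """
    fields = {k: v for k, v in auth_data.items() if k != "hash"}
    check_string = "\n".join(f"{k}={v}" for k, v in sorted(fields.items()))
    secret_key = hashlib.sha256(bot_token.encode()).digest()
    expected = hmac.new(secret_key, check_string.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing side channels
    return hmac.compare_digest(expected, auth_data.get("hash", ""))
```

In practice the backend should also reject payloads whose `auth_date` is too old, so that intercepted payloads cannot be replayed indefinitely.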

Challenges we faced

  • Low-quality source text: some Telegram posts lack concrete details, reducing extraction accuracy and requiring softer filtering and better validation rules.
  • Performance under load: advanced language models can slow down responses during peak usage, so we designed caching and a fallback mechanism that returns a standard response when ML services are overloaded or unavailable.
  • Product trade-offs: we prioritized the quality of semantic search and RAG reliability, so the Telegram notification module is not fully finished yet.
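The caching-plus-fallback idea from the second point above can be sketched as a simple wrapper: serve a cached answer while it is fresh, and if the ML service fails, degrade to the stale cached answer or a canned response. This is a minimal illustration with hypothetical names, not the production implementation:

```python
import time

def with_fallback(fn, cache, fallback_answer, ttl=300):
    """Wrap an ML-service call with a TTL cache and graceful degradation.

    - Fresh cached answers are returned without calling the service.
    - On service failure, a stale cached answer is preferred; otherwise
      the standard fallback response is returned.
    """
    def wrapper(query: str) -> str:
        entry = cache.get(query)
        if entry and time.time() - entry[1] < ttl:
            return entry[0]  # fresh cached answer
        try:
            answer = fn(query)
            cache[query] = (answer, time.time())
            return answer
        except Exception:
            # Service overloaded or unavailable: degrade gracefully
            return entry[0] if entry else fallback_answer
    return wrapper
```

In a real deployment the dict would typically be replaced by a shared cache such as Redis, and only specific timeout/connection exceptions would be caught.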

Results and next steps

Testing showed stable operation of the data pipeline and good extraction accuracy on official grant regulations. We also implemented metrics collection through web analytics and API logging to validate our hypothesis that centralized semantic search reduces time-to-find relevant grants.

Next steps include:

  • Completing Telegram notifications so users can receive relevant grant digests without constantly visiting the web app.
  • Moving from universal prompts to fine-tuned models on grant/legal/financial language to reduce hallucination risks.
  • Expanding parsing geography and sources (regional university portals, corporate accelerators).

Built With

  • Celery / cron (background jobs)
  • Django (REST API)
  • Docker & Docker Compose (containerization)
  • Git
  • Hybrid search (BM25 + vector search)
  • PostgreSQL 17 + pgvector (embeddings / vector search)
  • Python
  • RAG pipeline (LLM-based extraction + document Q&A)
  • React
  • Telegram Auth Widget + Telegram Bot API (auth & notifications)
  • TypeScript (SPA frontend)
  • Vite