Openverse API

Purpose

The Openverse API (openverse-api) is a system that allows programmatic access to openly licensed and public domain digital media. Our ambition is to index and catalog billions of openly licensed works, including articles, songs, videos, photographs, paintings, and more. Using this API, developers can access the digital commons in their own applications.

This repository is primarily concerned with back end infrastructure like datastores, servers, and APIs. The pipeline that feeds data into this system can be found in the Openverse Catalog repository. A front end web application that interfaces with the API can be found in the Openverse frontend repository.

API Documentation

The API documentation describes the endpoints in more detail, with examples of how to use them.

How to Run the Server Locally

Prerequisites

You need to install Docker (with Docker Compose), Git, and PostgreSQL client tools. On Debian, the package is called postgresql-client-common.

Running locally

1. Run the Docker daemon.
2. Open your command prompt (CMD) or terminal.
3. Clone Openverse API:

git clone https://github.com/WordPress/openverse-api.git

4. Change directories with cd openverse-api.
5. Start Openverse API locally by running the Docker containers:

docker-compose up

6. Wait until your CMD or terminal reports that it is starting the development server at http://0.0.0.0:8000/.
7. Open your browser and enter localhost:8000 in the address bar.
8. Make sure you see the local API documentation.
9. Open a new CMD or terminal and change directory to openverse-api.
10. Still in the new CMD or terminal, load the sample data. This script requires locally installed PostgreSQL client tools (see Prerequisites) to connect to and alter the database.

./load_sample_data.sh

11. Still in the new CMD or terminal, send a request to the API. Quoting the URL keeps the shell from interpreting the ? character.

curl "localhost:8000/v1/images?q=honey"

12. Make sure the API responds with JSON search results.

Congratulations! You just ran the server locally.
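
The same request can also be issued from application code. The sketch below is a minimal example that uses only the Python standard library. It assumes the local server from the steps above is reachable at http://localhost:8000; the function names build_search_url and search_images are illustrative and are not part of any Openverse client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the local development server started by docker-compose up.
BASE_URL = "http://localhost:8000"


def build_search_url(query, base_url=BASE_URL):
    """Return the image search URL for a query, with the query URL-encoded."""
    return f"{base_url}/v1/images?{urlencode({'q': query})}"


def search_images(query, base_url=BASE_URL):
    """Send the search request and return the decoded JSON response body."""
    with urlopen(build_search_url(query, base_url), timeout=10) as response:
        return json.load(response)


# Example (requires the containers to be running and the sample data loaded):
#     results = search_images("honey")
#     print(json.dumps(results, indent=2))
```

Building the URL with urlencode keeps queries that contain spaces or other special characters valid, which a hand-written query string does not guarantee.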

What Happens In the Background

After executing docker-compose up, you will be running:

A Django API server
Two PostgreSQL instances (one simulates the upstream data source, the other serves as the application database)
Elasticsearch
Redis
A thumbnail-generating image proxy
ingestion-server, a service for bulk ingesting and indexing search data
analytics, a REST API server for collecting search usage data
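
Because these containers become ready at different times, a script that runs right after docker-compose up may need to wait for the API before sending requests. The following is a minimal sketch and not part of the repository: wait_for_http is a hypothetical helper, and the URL in the usage comment assumes the local server address used in the steps above.

```python
import http.client
import time
from urllib.request import urlopen


def wait_for_http(url, timeout=120.0, interval=2.0):
    """Poll url until it answers successfully.

    Returns True as soon as a request succeeds, or False once `timeout`
    seconds have passed without a successful response.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urlopen(url, timeout=interval):
                return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, timed out, or an HTTP error status:
            # the server is not ready yet, so pause and try again.
            time.sleep(interval)
    return False


# Example (assumes the local server from the steps above):
#     if not wait_for_http("http://localhost:8000/"):
#         raise SystemExit("API did not come up; check the container logs")
```

Polling over HTTP rather than only testing whether the port is open avoids treating a forwarded but not yet served port as ready.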

Diagnosing local Elasticsearch issues

If the API server container failed to start, there's a good chance that Elasticsearch failed to start on your machine. Ensure that you have allocated enough memory to Docker; otherwise, the container will exit immediately with an error. Also, if the logs mention "insufficient max map count", increase the maximum number of memory map areas a process may use (vm.max_map_count). On most Linux machines, you can do this by adding the following line to /etc/sysctl.conf:

vm.max_map_count=262144

To make this setting take effect, run:

sudo sysctl -p
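
To confirm that the new value is active, you can read it back from /proc/sys/vm/max_map_count, where Linux exposes the live setting. This is a minimal sketch: the helper names are illustrative, and the threshold is simply the value recommended above.

```python
from pathlib import Path

# Linux exposes the live value of vm.max_map_count at this path.
MAX_MAP_COUNT_PATH = Path("/proc/sys/vm/max_map_count")
# Value recommended above for running Elasticsearch locally.
REQUIRED_MAX_MAP_COUNT = 262144


def parse_max_map_count(text):
    """Convert the file contents (a decimal number plus a newline) to an int."""
    return int(text.strip())


def is_sufficient(current, required=REQUIRED_MAX_MAP_COUNT):
    """Return True if the current limit meets the required minimum."""
    return current >= required


def describe(path=MAX_MAP_COUNT_PATH):
    """Read the live setting and return a one-line status message."""
    current = parse_max_map_count(path.read_text())
    status = "OK" if is_sufficient(current) else "too low"
    return f"vm.max_map_count={current} ({status}, need >= {REQUIRED_MAX_MAP_COUNT})"


# Example (Linux only):
#     print(describe())
```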

System Architecture

Basic flow of data

Search data is ingested from upstream sources provided by the data pipeline. At the time of writing, this includes data from Common Crawl and multiple third-party APIs. Once the data has been scraped and cleaned, it is transferred to the upstream database, which indicates that it is ready for production use.

Every week, the Ingestion Server automatically bulk copies ("ingests") the latest version of the data from the upstream database to the production database. Once the data has been copied and indexed inside the production database, it is indexed in Elasticsearch, at which point the Openverse API servers can serve the new data.
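
The ordering described above (copy first, then index, then serve) can be illustrated with a toy in-memory model. This is purely a sketch and not the Ingestion Server's code: plain dictionaries stand in for the production database and for Elasticsearch, and the id and title fields are hypothetical.

```python
def ingest(upstream_rows, production_db, search_index):
    """Copy rows into the production store, then rebuild the search index."""
    # Step 1: bulk copy from the upstream source into the production database.
    production_db.clear()
    production_db.update({row["id"]: dict(row) for row in upstream_rows})

    # Step 2: build the search index from the copied data, as in the flow
    # described above where indexing follows the copy.
    search_index.clear()
    for row_id, row in production_db.items():
        for word in row["title"].lower().split():
            search_index.setdefault(word, set()).add(row_id)


def search(query, production_db, search_index):
    """Return production rows whose title contains the query word."""
    ids = search_index.get(query.lower(), set())
    return [production_db[row_id] for row_id in sorted(ids)]


# Example:
#     db, index = {}, {}
#     ingest([{"id": 1, "title": "Honey bee"}], db, index)
#     search("honey", db, index)
```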

Description of subprojects

openverse-api is a Django REST Framework API server. For a full description of its capabilities, please see the browsable documentation.
ingestion-server is a service for downloading and indexing search data once it has been prepared by the Openverse Catalog.
analytics is a Falcon REST API for collecting usage data.

Running the tests

How to Run API live integration tests

You can check the health of a live deployment of the API by running the live integration tests.

Change directory to openverse-api

cd openverse-api

On the host

Install all dependencies for Openverse API.

pipenv install

Run the tests inside the Pipenv virtual environment.

pipenv run bash ./test/run_test.sh

Inside the container

Ensure that Docker containers are up. See the section above for instructions.

docker-compose ps

Run the tests in an interactive TTY connected to the web container.

docker-compose exec web bash ./test/run_test.sh

How to Run Ingestion Server tests

You can ingest and index some dummy data using the Ingestion Server API.

Change directory to ingestion_server

cd ingestion_server

Install all dependencies for Ingestion Server API

pipenv install

Launch a new shell session

pipenv shell

Run the integration tests

python3 test/integration_tests.py

Django Admin

You can view the custom administration views at the /admin/ endpoint.

Contributing

Pull requests are welcome! Feel free to join us on Slack and discuss the project with the engineers and community members on #openverse.

You are welcome to take any open issue in the tracker labeled help wanted or good first issue; there's no need to ask for permission in advance. Other issues are open for contribution as well, but may be less accessible or well-defined in comparison to those that are explicitly labeled.

See the CONTRIBUTING file for details.

Acknowledgments

Openverse, previously known as CC Search, was conceived and built at Creative Commons. We thank them for their commitment to open source and openly licensed content, with particular thanks to original team members @kgodey, @annatuma, @mathemancer, @aldenstpage, @brenoferreira, and @sclachar, along with their community of volunteers.
