Accessing Data Through The CKAN API

The HDX CKAN API Cookbook provides guidance on accessing data from the Humanitarian Data Exchange (HDX) using the CKAN API, focusing on data discovery and downloading rather than updating. It includes instructions for connecting to the API, retrieving dataset metadata, and performing searches with various filters. The document also offers Python code examples and tips for efficient data handling through the API.

HDX CKAN API Cookbook

Accessing data through the CKAN API

The Humanitarian Data Exchange (HDX) runs on a platform called CKAN. You can use
CKAN’s API to discover, update, and download data from HDX without human
intervention.

This cookbook focuses on data access rather than data updating (if you’re interested in
automated data updating, we have a Python library hdx-python-api that will make the
process much simpler).

We’ll start with some basic ingredients, then move on to some full recipes for data
access. In each case, we’ll show both the direct RESTful API URL and the Python code
that you can use via the official ckanapi library (which also includes command-line
utilities for use in shell scripts).

For a simpler method to get automatic notifications about new or modified datasets
matching any search criteria (e.g. for a specific organization, country, tag, search string,
or combination of those), see Appendix A. Syndication feeds for notifications.

1. Basics
This section introduces the fundamental information you’ll need to know to connect to
the API. You may choose to skip straight to 3. Simple search examples if you prefer to
see some examples first.

1.1. Connecting to the API

The root URL for the CKAN API in HDX is

https://data.humdata.org/api/3/

To use this with the CKAN Python API library, try

from ckanapi import RemoteCKAN

ckan = RemoteCKAN("https://data.humdata.org")

(Note that in Python, you supply just the CKAN domain name, not the full API path.) For
all the Python examples that follow, we’ll assume that you’ve created this ckan object
already.

1.2. Reading a single dataset with package_show

A useful API call is retrieving the JSON representation of a dataset’s metadata. On HDX
we refer to "datasets", while the CKAN API uses the term "package" for the same thing. This
example grabs the dataset for Humanitarian Response Plan projects for Nigeria:

https://data.humdata.org/api/3/action/package_show?id=hrp-projects-nga

or in Python:

package = ckan.action.package_show(id="hrp-projects-nga")

(In Python, you’ll get the JSON result directly; with the URL, you’ll get the metadata
response under a key called "result".) The string "hrp-projects-nga" is the same stub
that appears at the end of the dataset URL on HDX,
https://data.humdata.org/dataset/hrp-projects-nga.
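
If you would rather call the raw URL from Python without installing ckanapi, the following is a minimal sketch using only the standard library. The helper names api_url, unwrap and fetch are illustrative (they are not part of ckanapi), and the sketch assumes the response envelope has the "success" and "result" keys described in this cookbook.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://data.humdata.org/api/3/action/"


def api_url(action, **params):
    """Build a raw CKAN action URL (hypothetical helper, not part of ckanapi)."""
    query = urlencode(params)
    return API_ROOT + action + ("?" + query if query else "")


def unwrap(response):
    """Return the payload stored under "result" in a raw API response."""
    if not response.get("success"):
        raise RuntimeError("CKAN API call did not succeed")
    return response["result"]


def fetch(action, **params):
    """Call the API over HTTP and return the unwrapped result."""
    with urlopen(api_url(action, **params)) as handle:
        return unwrap(json.load(handle))
```

Calling fetch("package_show", id="hrp-projects-nga") should then give the same dictionary that ckan.action.package_show returns.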

Exercise: try using the patterns above to read the metadata for other datasets on HDX.

1.3. Anatomy of a dataset

When you try the queries above, you’ll note that the JSON does not contain the actual
data. To get at that, it’s necessary to understand the basic structure of the metadata
returned (there are many other properties, but we’ll stick with these for now):

{
  "name": "...",
  "title": "...",
  "description": "...",
  "created": "...",
  "last_modified": "...",
  "dataset_date": "...",
  "dataseries_name": "...",
  "groups": [
    { … },
    { … }
  ],
  "organization": { … },
  "resources": [
    { … },
    { … }
  ],
  "tags": [
    { … },
    { … }
  ]
}

name: the dataset stub on HDX, like "hrp-projects-nga"

title: the full human-readable dataset title, like "Humanitarian Response Plan projects
for Nigeria"

description: a longer description of the dataset.

created: the date and time when the dataset was first created on HDX.

last_modified: the date and time when the dataset metadata on HDX was last changed
(note that the data itself can change independently, especially if it’s hosted off HDX).

dataset_date: the date range when the data is applicable (could be in the past or future
relative to the creation and last-modified dates).

dataseries_name: (optional) a curated series, or list, to which this dataset belongs.

groups: a list of data structures describing countries or country-like entities associated
with the dataset (in this case, just Nigeria).

organization: a data structure describing the data provider (e.g. OCHA Financial Tracking
Service).

resources: a list of data structures describing the resources (files) inside the dataset.

tags: a list of data structures describing the semantic tags associated with the dataset
(like "who is doing what and where-3w-4w-5w").

(You can learn more about the different data structures and properties at
https://docs.ckan.org/en/api)

1.3.1. Downloading the data

To get at the actual data, you need to go inside the resources list, pick a resource (the
first one is often a good choice), and use the download_url property to download the
actual data. A download_url will look something like this (though it could also point
directly to a site outside HDX):

https://data.humdata.org/dataset/3527869c-8fe9-4289-9d57-1811e789bf60/resource/96b24403-0de4-4652-bb76-f585c04b5e6d/download/admin1-summaries-litpop.csv
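
The step from package metadata to a download link can be sketched as follows. The function name first_download_url is illustrative, and falling back to a "url" property when "download_url" is missing is an assumption rather than something this cookbook specifies.

```python
def first_download_url(package):
    """Return the download URL of the first resource, or None if there is none."""
    resources = package.get("resources") or []
    if not resources:
        return None
    first = resources[0]
    # "download_url" is the property described in the text; the "url"
    # fallback is an assumption for resources that do not carry it.
    return first.get("download_url") or first.get("url")
```

The returned URL can then be passed to any HTTP client, for example urllib.request.urlretrieve from the standard library.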

We’ll talk more about using the different parts of the dataset metadata later. The main
focus of this cookbook will be how to locate datasets on HDX so that your code can
download and process them with minimal human intervention.

2. Searching HDX with package_search

If you are the kind of person who likes to start with basic principles, then go ahead and
read this section now. If you prefer to learn by example, then feel free to skip ahead to
3. Simple search examples, and then come back here later to fill in any gaps.

To find datasets, we’ll use the package_search endpoint at

https://data.humdata.org/api/3/action/package_search

or, in Python

packages = ckan.action.package_search()

(Calling this without parameters matches all the datasets on HDX, although the results
come back one page at a time.)

2.1. Paging through results

CKAN search results are paged: the start parameter gives the starting
position (zero-based), while the rows parameter gives the page size. To get all of the
results, choose a page size (e.g. 100), then keep advancing start by that amount in each
query until you have all of the results. For example, the following query gives the third
page of 100 public packages/datasets on HDX:

https://data.humdata.org/api/3/action/package_search?start=200&rows=100

or, in Python

packages = ckan.action.package_search(start=200, rows=100)
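
The paging loop described above can be sketched as a generator. The name iter_packages is illustrative; the ckan argument is assumed to be a RemoteCKAN object (or anything else exposing action.package_search), and the loop relies on the "count" and "results" keys of the search result.

```python
def iter_packages(ckan, rows=100, **search_params):
    """Yield every package matching a search, one page at a time."""
    start = 0
    while True:
        page = ckan.action.package_search(start=start, rows=rows, **search_params)
        results = page["results"]
        if not results:
            break
        yield from results
        # Advance by the number of results actually received, in case the
        # server returns fewer rows than were requested.
        start += len(results)
        if start >= page["count"]:
            break
```

For example, iter_packages(ckan, q="groups:ukr") would walk through every page of a search without the caller tracking start positions.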

Tip: instead of doing your own paging, you can use the ckancrawler Python package,
which feeds search results smoothly into a single iterator like this, doing all the paging
behind the scenes:

from ckancrawler import Crawler

crawler = Crawler("https://data.humdata.org")
for package in crawler.packages():
    # do something with the package, e.g. print its stub
    print(package["name"])

2.2. Constructing queries

For this cookbook, we use the q parameter to queries, which corresponds to the search
text used at https://data.humdata.org/search

(There is an alternative parameter named fq that works the same way but doesn't affect
search-result weighting for relevancy. You can ignore it for the sake of the examples in
this cookbook, but you might find it useful if you decide to try more-complex free-text
searches in the future.)

If you’re constructing a URL query directly, then you will have to URL-encode your search
string, so that "displaced people" (for example) becomes "displaced%20people" or
"displaced+people":

https://data.humdata.org/api/3/action/package_search?q=displaced%20people

In Python, the ckanapi will do the escaping for you:

packages = ckan.action.package_search(q="displaced people")
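
If you do build URLs by hand in Python, the standard library can handle the encoding. This short sketch shows both encodings mentioned above; the variable names are illustrative.

```python
from urllib.parse import quote, urlencode

search_text = "displaced people"

# quote() percent-encodes the space; urlencode() writes it as "+" instead.
percent_form = quote(search_text)
plus_form = urlencode({"q": search_text})

url = "https://data.humdata.org/api/3/action/package_search?" + plus_form
print(percent_form)
print(url)
```

Either form is acceptable in the query string.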

2.2.1. Query filters

Queries can address specific metadata fields using a syntax like fieldname:value. The
fields used in this cookbook are:

groups: datasets related to this country or country-like entity, using the HDX country
stub, which is usually the ISO3 code in lower case (see 3.1. Finding datasets by
country/group).

organization: datasets from this provider, using the HDX organization stub (see 3.2.
Finding datasets by provider).

vocab_Topics: datasets labelled with this thematic tag (see 3.3. Finding datasets by
topic tag).

dataseries_name: datasets belonging to this data series (see 3.4. Finding datasets by
data series).

For example, this URL retrieves datasets provided by FAO (440 of them as of November
2024), using the HDX organization stub "fao":

https://data.humdata.org/api/3/action/package_search?q=organization:fao

Python code:

packages = ckan.action.package_search(q="organization:fao")

For a complete list of query filters available for HDX, see B.1. Complete list of HDX CKAN
search fields.

2.2.2. Advanced boolean logic
By default, repeated filters imply boolean AND: if you have a query "groups:afg
groups:pak" it will include only datasets that apply to both Afghanistan and Pakistan. For
advanced use (beyond what’s needed for the examples in this cookbook), you can use
special Solr filter syntax. For example, the query "groups:afg OR groups:pak" will
return datasets associated with either or both countries:

https://data.humdata.org/api/3/action/package_search?q=groups:afg%20OR%20groups:pak

Python code:

packages = ckan.action.package_search(
    q="groups:afg OR groups:pak"
)
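
When the list of countries comes from your own code, an OR clause can be assembled programmatically. This is a minimal sketch with a hypothetical helper named any_of; the parentheses are standard Solr grouping syntax and keep the clause together if you later combine it with other filters.

```python
def any_of(field, values):
    """Build a filter matching any of the given values for one field."""
    clauses = [f"{field}:{value}" for value in values]
    if len(clauses) == 1:
        return clauses[0]
    # Parentheses keep the OR clause together when other filters are added.
    return "(" + " OR ".join(clauses) + ")"


query = any_of("groups", ["afg", "pak"])
print(query)
```

The resulting string can be passed as the q parameter of package_search.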

2.3. Sorting results

The optional sort parameter allows you to control the order in which you get your search
results. After each of the following, you can add "asc" (ascending order; the default) or
"desc" (descending order):

metadata_created: when the dataset was first created on HDX.

last_modified: the last time the dataset was changed on HDX (will not detect changes to
remote resources, like APIs).

score: relevance to your search query (add desc to get the most-relevant datasets first).

title_case_insensitive: title of the dataset.

pageviews_last_14_days: trending (add desc to get the most-popular ones first).

total_res_downloads: number of data downloads (add desc to get the most-downloaded
ones first).

You can combine these if you wish.

The following URL returns datasets sorted by total downloads in descending order (so
that the most-popular ones appear first); note that the whitespace needs to be
URL-encoded:

https://data.humdata.org/api/3/action/package_search?sort=total_res_downloads%20desc

Python code:

packages = ckan.action.package_search(
    sort="total_res_downloads desc"
)
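
To combine several sort criteria, CKAN expects a single comma-separated string. The following sketch builds one from (field, direction) pairs; sort_spec is a hypothetical helper name.

```python
def sort_spec(*criteria):
    """Join (field, direction) pairs into a comma-separated sort string."""
    parts = []
    for field, direction in criteria:
        if direction not in ("asc", "desc"):
            raise ValueError(f"direction must be 'asc' or 'desc', got {direction!r}")
        parts.append(f"{field} {direction}")
    return ", ".join(parts)


combined = sort_spec(("score", "desc"), ("last_modified", "desc"))
print(combined)
```

Here the most relevant datasets come first, and ties are broken by the most recent modification.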

2.4. Format of search results

Search results using a raw URL look like this:

{
  "help": "…",
  "success": true,
  "result": {
    "count": …,
    "facets": { … },
    "expanded": { … },
    "results": [
      { … },
      { … }
    ]
  }
}

The Python API strips off the top layer, so you see just the following:

{
  "count": …,
  "facets": { … },
  "expanded": { … },
  "results": [
    { … },
    { … }
  ]
}

Only two of these fields are essential:

count: the total results available (not just the ones returned from this paged query)

results: a list of package/dataset metadata objects, as described in 1.3. Anatomy of a
dataset.

3. Simple search examples

This section contains simple examples to illustrate the search principles introduced in 2.
Searching HDX with package_search. Note that we are searching only dataset metadata,
not the data itself. More-advanced examples appear in 4. Complex queries.

3.1. Finding datasets by country/group

As mentioned in 2.2.1. Query filters, HDX represents countries (and country-like entities)
using the CKAN groups parameter. HDX group stubs are mostly identical to ISO3 country
codes, in lower case, so the code for Ukraine is "ukr" (not "UKR" or "UA"). A full list of
HDX countries/groups is available from

https://data.humdata.org/api/3/action/group_list?all_fields=true

or in Python,

countries = ckan.action.group_list(all_fields=True)

Use the name field in your query (that contains the code).
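
Looking up a stub from a human-readable name can be sketched like this. The helper find_stub is hypothetical, and it assumes each entry in the list carries a "title" or "display_name" property alongside "name".

```python
def find_stub(entries, title):
    """Return the "name" (stub) of the entry whose title matches, ignoring case."""
    wanted = title.casefold()
    for entry in entries:
        # Assumption: the readable name is under "title" or "display_name".
        candidates = (entry.get("title"), entry.get("display_name"))
        if any(c and c.casefold() == wanted for c in candidates):
            return entry["name"]
    return None
```

With the countries list from group_list(all_fields=True), find_stub(countries, "Ukraine") would be expected to return "ukr". The same helper should work for the organization list, provided its entries have the same properties.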

3.1.1. Example: Ukraine

The following query will return public datasets for Ukraine (215 of them as of November
2024):

https://data.humdata.org/api/3/action/package_search?q=groups:ukr

or in Python,

packages = ckan.action.package_search(q="groups:ukr")

As described in 2.3. Sorting results, if you want the most-recent datasets first, you can
add a sort parameter:

https://data.humdata.org/api/3/action/package_search?q=groups:ukr&sort=metadata_created%20desc

or in Python,

packages = ckan.action.package_search(
    q="groups:ukr",
    sort="metadata_created desc"
)

3.2. Finding datasets by provider

Data providers in HDX use the organization parameter, introduced in 2.2.1. Query filters.
A complete list of HDX data-provider organizations is available at

https://data.humdata.org/api/3/action/organization_list?all_fields=true

or in Python,

orgs = ckan.action.organization_list(all_fields=True)

As with countries, use the name field from these results in your queries.

3.2.1. Example: Integrated Food Security Phase Classification

Looking in the list above, the Integrated Food Security Phase Classification (IPC) has the
HDX stub (name) "ipc", so you should use the identifier "ipc" to find datasets provided
by the organisation. Alternatively, you can find the same stub at the end of the
organization page URL, https://data.humdata.org/organization/ipc

The following query will return a list of IPC’s datasets (57 of them as of November 2024):

https://data.humdata.org/api/3/action/package_search?q=organization:ipc

or in Python,

packages = ckan.action.package_search(q="organization:ipc")

3.3. Finding datasets by topic tag

HDX uses a special controlled CKAN tag vocabulary named "Topics" for thematic tagging
of datasets. In searches, you query a topic using the vocab_Topics parameter introduced
in 2.2.1. Query filters.

A list of topic tags is available at

https://data.humdata.org/api/3/action/tag_list?vocabulary_id=Topics&all_fields=true

or in Python,

topics = ckan.action.tag_list(
    vocabulary_id="Topics",
    all_fields=True
)

As with countries and organizations, use the name field from these results in your
queries.

3.3.1. Example: Gender-based violence

The topic tag "gender-based violence-gbv" will allow us to find HDX datasets related to
gender-based violence:

https://data.humdata.org/api/3/action/package_search?q=vocab_Topics:%22gender-based%20violence-gbv%22

WARNING: If the topic tag contains whitespace (as many in HDX unfortunately do), you
will have to both URL-encode the whitespace and surround the tag name in quotation
marks when constructing the URL. This is an easy trap to fall into when working with the
HDX CKAN API.

In Python, the quotation marks are also required (but not, obviously, the URL encoding):

packages = ckan.action.package_search(
    q="vocab_Topics:\"gender-based violence-gbv\""
)
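
One way to avoid the quoting trap is to let a small helper decide when quotation marks are needed. This is a sketch only; field_filter is a hypothetical name, and it quotes any value that contains whitespace.

```python
def field_filter(field, value):
    """Return a fieldname:value filter, quoting the value if it has whitespace."""
    if any(ch.isspace() for ch in value):
        # Escape any embedded quotation marks before wrapping the value.
        escaped = value.replace('"', '\\"')
        return f'{field}:"{escaped}"'
    return f"{field}:{value}"


topic = field_filter("vocab_Topics", "gender-based violence-gbv")
country = field_filter("groups", "ukr")
print(topic, country)
```

Joining several such filters with spaces gives a query string that can be passed as q.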

3.4. Finding datasets by data series

HDX data series group datasets by source and theme, for example "IOM - DTM Baseline
Assessment". These are an effective way to find datasets automatically when they’re
created. In searches, you query data series using the dataseries_name parameter
introduced in 2.2.1. Query filters.

A list of data series is available at

https://data.humdata.org/api/3/action/package_search?rows=0&fq=dataset_type:dataset&facet.field=[%22dataseries_name%22]&facet.limit=1000

Use the dataseries_name field from these results in your queries.
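
In Python, the same facet request might be sketched as follows. This is an assumption-laden example: it supposes that ckanapi passes the dotted parameter names through unchanged, and that the result carries a "facets" mapping of the form {field: {value: count}}. The names FACET_PARAMS and data_series_names are illustrative.

```python
# Parameters mirroring the URL above; the dotted keys cannot be written as
# Python keyword arguments, so they are passed with ** unpacking.
FACET_PARAMS = {
    "rows": 0,
    "fq": "dataset_type:dataset",
    "facet.field": ["dataseries_name"],
    "facet.limit": 1000,
}


def data_series_names(search_result, field="dataseries_name"):
    """Return the sorted facet values for a field (assumed result shape)."""
    counts = search_result.get("facets", {}).get(field, {})
    return sorted(counts)
```

A call such as data_series_names(ckan.action.package_search(**FACET_PARAMS)) would then be expected to list the available data-series names.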

3.4.1. Example: IOM DTM Baseline Assessments

As mentioned above, the data-series name "IOM - DTM Baseline Assessment" will allow
you to find HDX datasets containing IOM’s DTM baseline assessments for various
countries (remember to quote the name and URL-encode any whitespace in it):

https://data.humdata.org/api/3/action/package_search?q=dataseries_name:%22IOM%20-%20DTM%20Baseline%20Assessment%22

WARNING: If the data-series name contains whitespace (as many in HDX unfortunately
do), you will have to both URL-encode the whitespace and surround the data-series name
in quotation marks when constructing the URL. This is an easy trap to fall into when
working with the HDX CKAN API.

In Python, you don’t need to URL-encode the whitespace, but you still need to quote it:

packages = ckan.action.package_search(
    q="dataseries_name:\"IOM - DTM Baseline Assessment\""
)

3.5. Finding datasets by free-text search

In addition to the special fields described in 2.2.1. Query filters, you can choose to
simply search for a text string that’s likely to appear in a dataset’s title or description.
For example, the following will find all datasets that mention "volunteers" (10 of them as
of December 2024):

https://data.humdata.org/api/3/action/package_search?q=volunteers

In Python, you would use

packages = ckan.action.package_search(q="volunteers")

For information on using wildcards in text searches, see B.3. Querying with wildcards
and ranges.

4. Complex queries
The API calls in 3. Simple search examples won’t always be enough. To locate a specific
dataset with confidence, you will need to combine multiple filters, sorting specifications,
and (in some cases) free-text search. The following examples show how your code can
use these together to find a relevant dataset automatically.

4.1. Latest OCHA 3W for Lebanon

Activity reports ("Who? What? Where?," or simply "3W") are a core humanitarian data
type. Many international organisations and humanitarian clusters produce their own
3Ws, but for this example, we want to get the latest consolidated 3W, a cross-sector
dataset which UNOCHA coordinates in countries where it works.

As of December 2024, there is no data series for all OCHA 3Ws, but there is a general
CKAN topic tag "who is doing what and where-3w-4w-5w" (see 3.3. Finding
datasets by topic tag), so this makes a good starting point:

vocab_Topics:"who is doing what and where-3w-4w-5w"

Next, we want to narrow the results down to Lebanon (see 3.1. Finding datasets by
country/group), so we add the filter

groups:lbn

And we want only 3Ws from OCHA, not from other organisations (see 3.2. Finding
datasets by provider). That’s trickier, because each OCHA field office is a separate
organisation on HDX. For OCHA Lebanon, the HDX organisation is "ocha-lebanon".
So we also add

organization:ocha-lebanon

(In this case, we could have left out the country filter, but in others, one OCHA field
office might produce 3Ws for multiple countries, so it’s usually best to leave it in.)

And finally, we’ll want to sort the results so that the latest 3W appears first (see 2.3.
Sorting results), so we will add the sort parameter

last_modified desc

When we put it all together, we end up with this API call:

https://data.humdata.org/api/3/action/package_search?q=vocab_Topics%3A%22who%20is%20doing%20what%20and%20where-3w-4w-5w%22%20groups:lbn%20organization:ocha-lebanon&sort=last_modified%20desc

or in Python,

query_parts = (
    "vocab_Topics:\"who is doing what and where-3w-4w-5w\"",
    "groups:lbn",
    "organization:ocha-lebanon",
)

packages = ckan.action.package_search(
    q=" ".join(query_parts),
    sort="last_modified desc"
)

The first result should reliably be the latest available OCHA 3W for Lebanon. This will
work even if the dataset name and URL change on HDX.
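
Picking out that first result can be wrapped in a small function. This is a sketch rather than a definitive recipe: latest_package is a hypothetical name, the ckan argument is assumed to be a RemoteCKAN object, and rows=1 is used because only the top result is needed.

```python
def latest_package(ckan, query_parts, sort="last_modified desc"):
    """Return the first package for the combined filters, or None if none match."""
    result = ckan.action.package_search(
        q=" ".join(query_parts),
        sort=sort,
        rows=1,  # only the top-ranked result is needed
    )
    results = result["results"]
    return results[0] if results else None
```

Returning None when nothing matches lets the calling code decide whether a missing dataset is an error or simply something to retry later.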

4.2. Food prices for Venezuela

For this exercise, we want to find the latest food prices for Venezuela. We can find the
data series "WFP - Food Prices" using the method described in 3.4. Finding
datasets by data series, which gets us a good part of the way there.

dataseries_name:"WFP - Food Prices"

We also need to set the group to "ven" for Venezuela (see 3.1. Finding datasets by
country/group):

groups:ven

And finally, once again, we want to set the sort to

last_modified desc

so that the first result will be the most-recent one in case there is more than one result
(see 2.3. Sorting results).

All together, that gives us the following API call:

https://data.humdata.org/api/3/action/package_search?q=dataseries_name:%22WFP%20-%20Food%20Prices%22%20groups:ven&sort=last_modified%20desc

In Python, we can retrieve the same data like this:

query_parts = (
    "dataseries_name:\"WFP - Food Prices\"",
    "groups:ven",
)
packages = ckan.action.package_search(
    q=" ".join(query_parts),
    sort="last_modified desc"
)

4.4. Sex- and age-disaggregated datasets related to refugees
Now, let’s try a more-thematic approach. We want to find all datasets that contain sex-
and age-disaggregated data about refugees. In this case, we want to combine two topic
tags, as introduced in 3.3. Finding datasets by topic tag (note that we have to quote the
first one because of the internal spaces):

vocab_Topics:"sex and age disaggregated data-sadd"

vocab_Topics:refugees

These result in the following API call:

https://data.humdata.org/api/3/action/package_search?q=vocab_Topics:%22sex%20and%20age%20disaggregated%20data-sadd%22%20vocab_Topics:refugees

In Python, you can use the following code:

query_parts = (
    "vocab_Topics:\"sex and age disaggregated data-sadd\"",
    "vocab_Topics:refugees",
)
packages = ckan.action.package_search(
    q=" ".join(query_parts)
)

Appendix A. Syndication feeds for notifications
As an alternative to using the CKAN API to find data on HDX, you can use Atom (similar
to RSS) syndication feeds to receive notifications of any new or modified datasets
matching your search criteria. You construct searches the same way as for the CKAN API,
but the URL pattern looks like this (using a simple text search for "food"):

https://data.humdata.org/feeds/dataset.atom?q=food

The result is a list of entries like this (in XML), though libraries for all major programming
languages ensure that you will never have to deal with the XML markup directly:

<entry>
  <id>https://data.humdata.org/dataset/efad2587-3c06-4530-ba12-1c6e8ae393db</id>
  <title>Guinea - HungerMap data</title>
  <updated>2024-12-10T14:13:31.694584+00:00</updated>
  <content>HungerMapLIVE is ...</content>
  <link href="https://data.humdata.org/api/3/action/package_show?id=efad2587-3c06-4530-ba12-1c6e8ae393db" rel="alternate"/>
  <link href="https://data.humdata.org/api/3/action/package_show?id=wfp-hungermap-data-for-gin" rel="enclosure"/>
  <category term="food security"/>
  <category term="hxl"/>
  <category term="indicators"/>
  <published>2024-11-25T21:44:13.878433+00:00</published>
</entry>

Note that the entry contains a direct link into the HDX CKAN API to download the
package metadata (see 1.3. Anatomy of a dataset). The first entry will be the
most-recently-modified result, and so on, so you can use this to get automated update
notifications for any of the searches described in the cookbook.

There are also simpler, dedicated URLs to get updates for a country or organization
without using the fielded search syntax. For example, this feed will always return the
most-recently-updated public datasets related to Afghanistan:

https://data.humdata.org/feeds/group/afg.atom

And this feed will always return the latest public datasets provided by the World Food
Programme:

https://data.humdata.org/feeds/organization/wfp.atom

You can consume these feeds programmatically using a library like atoma in Python, or
you can load them into a feed reader like Feedly or NetNewsWire for human
consumption (alongside blogs, news articles, and other syndicated information).
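
As an illustration of programmatic consumption, here is a minimal sketch that reads feed entries with Python’s standard-library XML parser instead of atoma. It assumes the feed declares the standard Atom namespace; parse_feed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace (assumed to be the one declared by the feed).
ATOM = "{http://www.w3.org/2005/Atom}"


def parse_feed(xml_text):
    """Return a list of dicts describing each entry in an Atom feed."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter(ATOM + "entry"):
        links = {
            link.get("rel"): link.get("href")
            for link in entry.findall(ATOM + "link")
        }
        entries.append({
            "title": entry.findtext(ATOM + "title"),
            "updated": entry.findtext(ATOM + "updated"),
            "links": links,
            "categories": [c.get("term") for c in entry.findall(ATOM + "category")],
        })
    return entries
```

The "links" dictionary is keyed by the rel attribute, so the package_show links in each entry can be looked up under "alternate" or "enclosure".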

Appendix B. Advanced CKAN search features
(Contributed by Ian Hopkinson)

CKAN and HDX provide more-advanced search features that aren’t covered in this
cookbook, but might be useful for specific needs.

B.1. Complete list of HDX CKAN search fields

Here is the complete list of HDX CKAN search fields (as introduced in 2.2.1. Query filters).
Those fields prefixed res_ refer to resource metadata, the rest to dataset metadata.
Those fields marked with a * are date type fields which support some special search
features described below. The approved tags which appear on HDX datasets are actually
stored in a field called "vocab_Topics".

archived               batch                   caveats                 creator_user_id
data_update_frequency  dataseries_name         dataset_preview         dataset_source
has_geodata            has_quickcharts         has_showcases           id
is_requestdata_type    isopen                  last_modified*          license_id
license_title          maintainer              metadata_created*       metadata_modified*
methodology            name                    notes                   num_of_showcases
num_resources          num_tags                organization            overdue_date*
owner_org              package_creator         pageviews_last_14_days  qa_completed
res_description        res_extras_broken_link  res_extras_in_hapi      res_format
res_name               res_url                 review_date*            solr_additions
state                  subnational             title                   total_res_downloads
type                   updated_by_script       url                     vocab_Topics

B.2. Querying date fields
CKAN’s date search facilities are powerful but not always obvious. You can do an exact
search for a datetime with a query like
metadata_created:"2019-12-04T10:23:27.806321Z". Note that if the trailing Z is omitted,
the search fails with an Invalid Date String error, even though package_search returns
dates without a trailing Z. Date fields can also be queried with a range
expression, which allows the special values NOW, DAY, MONTH, YEAR, HOUR and
MINUTE. These can be combined with "normal" dates using the +, - and / operators (/ is
rounding).

Example:

Find datasets modified in the last 24 hours:

https://data.humdata.org/api/action/package_search?q=last_modified:[NOW-1DAY%20TO%20NOW]

or in Python,

packages = ckan.action.package_search(
    q="last_modified:[NOW-1DAY TO NOW]"
)
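
Because of the trailing-Z gotcha, it can help to normalise timestamps before reusing them in a query. The helpers below are a sketch with hypothetical names; using the DAY unit with counts other than 1 relies on Solr date-math syntax and is an assumption here rather than something this cookbook demonstrates.

```python
def solr_datetime(timestamp):
    """Append the trailing Z that exact date queries need, if it is missing."""
    return timestamp if timestamp.endswith("Z") else timestamp + "Z"


def exact_date_filter(field, timestamp):
    """Build an exact-match filter for a date field."""
    return f'{field}:"{solr_datetime(timestamp)}"'


def modified_within_days(days, field="last_modified"):
    """Build a range filter covering the last `days` days up to now."""
    return f"{field}:[NOW-{days}DAY TO NOW]"


print(exact_date_filter("metadata_created", "2019-12-04T10:23:27.806321"))
print(modified_within_days(1))
```

This makes it possible to take a date straight from a package_search result and feed it back into a later query.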

B.3. Querying with wildcards and ranges

The multicharacter (*) and single character (?) wildcard operators are supported but
cannot be used in quoted search terms (use a backslash instead to escape whitespace).

Example:

Find all datasets with the source "ETH Zurich Cli?ada" where "?" represents any letter:

https://data.humdata.org/api/action/package_search?q=dataset_source:ETH\%20Zurich\%20Cli?ada

or in Python,

packages = ckan.action.package_search(
    q="dataset_source:ETH\\ Zurich\\ Cli?ada"
)

Example:

Find all datasets with the source "ETH Zurich Cli*" where "*" represents 0 or more
letters:

https://data.humdata.org/api/action/package_search?q=dataset_source:ETH\%20Zurich\%20Cli*

or in Python,

packages = ckan.action.package_search(
    q="dataset_source:ETH\\ Zurich\\ Cli*"
)
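
The backslash escaping in these wildcard queries is easy to get wrong by hand, so it may be worth delegating to a helper. This sketch uses hypothetical names and escapes only spaces, leaving the wildcard characters untouched.

```python
def escape_whitespace(term):
    """Prefix each space with a backslash so the term can stay unquoted."""
    return term.replace(" ", "\\ ")


def wildcard_filter(field, pattern):
    """Build an unquoted fieldname:pattern filter that may contain * or ?."""
    return f"{field}:{escape_whitespace(pattern)}"


print(wildcard_filter("dataset_source", "ETH Zurich Cli*"))
```

The printed value contains single backslashes, matching the string that the Python examples above pass as q.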

As well as these simple wildcard usages, the search API also supports range queries,
which can include wildcards.

Example:

Find datasets that have between 2 and 5 resources:

https://data.humdata.org/api/action/package_search?q=num_resources:[2%20TO%205]

Or in Python,

packages = ckan.action.package_search(
    q="num_resources:[2 TO 5]"
)

Range queries can be used to select values which are not null with the query

fieldname:[* TO *]

This is the approved way of doing such selections. The query

fieldname:*

includes datasets that have null values as well as all other values.

AJIT DASH
2/5 (2)
SAS Viya: The Python Perspective
From Everand
SAS Viya: The Python Perspective
Kevin D. Smith
No ratings yet
Parallel Python with Dask
From Everand
Parallel Python with Dask
Tim Peters
No ratings yet
Practical and Efficient SAS Programming: The Insider's Guide
From Everand
Practical and Efficient SAS Programming: The Insider's Guide
Martha Messineo
No ratings yet
Ian Talks JS A-Z: WebDevAtoZ, #1
From Everand
Ian Talks JS A-Z: WebDevAtoZ, #1
Ian Eress
No ratings yet
Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS
From Everand
Unstructured Data Analysis: Entity Resolution and Regular Expressions in SAS
Matthew Windham
No ratings yet
Data Structures I Essentials
From Everand
Data Structures I Essentials
Dennis Smolarski
No ratings yet
Getting Started with SAS Programming: Using SAS Studio in the Cloud
From Everand
Getting Started with SAS Programming: Using SAS Studio in the Cloud
Ron Cody
No ratings yet
Search Algorithm: Fundamentals and Applications
From Everand
Search Algorithm: Fundamentals and Applications
Fouad Sabry
No ratings yet
Learning Cascading
From Everand
Learning Cascading
Michael Covert
No ratings yet
Mastering Node.js Web Development: Go on a comprehensive journey from the fundamentals to advanced web development with Node.js
From Everand
Mastering Node.js Web Development: Go on a comprehensive journey from the fundamentals to advanced web development with Node.js
Adam Freeman
No ratings yet
Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data
From Everand
Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data
Byron Ellis
No ratings yet
HTML5 Reference: An Alphabetical Guide
From Everand
HTML5 Reference: An Alphabetical Guide
Jo Foster
No ratings yet
10 Lessons in Front-end
From Everand
10 Lessons in Front-end
Krasimir Tsonev
2/5 (1)
Mastering Pandas in Python: Course Book
From Everand
Mastering Pandas in Python: Course Book
Pedro Martins
No ratings yet
NgRx SignalStore: An effortless solution for state management
From Everand
NgRx SignalStore: An effortless solution for state management
Abdelfattah Ragab
No ratings yet
Python Advanced Programming: The Guide to Learn Python Programming. Reference with Exercises and Samples About Dynamical Programming, Multithreading, Multiprocessing, Debugging, Testing and More
From Everand
Python Advanced Programming: The Guide to Learn Python Programming. Reference with Exercises and Samples About Dynamical Programming, Multithreading, Multiprocessing, Debugging, Testing and More
Marcus Richards
No ratings yet
React Components
From Everand
React Components
Christopher Pitt
No ratings yet
Mastering Apache Cassandra - Second Edition
From Everand
Mastering Apache Cassandra - Second Edition
Nishant Neeraj
No ratings yet
NoSQL Injection for Elasticsearch
From Everand
NoSQL Injection for Elasticsearch
Gary Drocella
No ratings yet
Ian Talks Python A-Z
From Everand
Ian Talks Python A-Z
Ian Eress
No ratings yet
Learn Hadoop in 24 Hours
From Everand
Learn Hadoop in 24 Hours
Alex Nordeen
No ratings yet
Mastering Data Structures and Algorithms in Python & Java
From Everand
Mastering Data Structures and Algorithms in Python & Java
Sachin Naha
No ratings yet
Learning DHTMLX Suite UI
From Everand
Learning DHTMLX Suite UI
Eli Geske
No ratings yet
Azure For Starters
From Everand
Azure For Starters
Chinmoy Mukherjee
No ratings yet
The Data Detective's Toolkit: Cutting-Edge Techniques and SAS Macros to Clean, Prepare, and Manage Data
From Everand
The Data Detective's Toolkit: Cutting-Edge Techniques and SAS Macros to Clean, Prepare, and Manage Data
Kim Chantala
No ratings yet
Fast Data Processing with Spark 2 - Third Edition
From Everand
Fast Data Processing with Spark 2 - Third Edition
Krishna Sankar
No ratings yet
Visualizing Data Structures
From Everand
Visualizing Data Structures
Rhonda Hoenigman
No ratings yet
Firebase Storage for Angular: A reliable file upload solution for your applications
From Everand
Firebase Storage for Angular: A reliable file upload solution for your applications
Abdelfattah Ragab
No ratings yet
IGNOU PGDCA MCS 206 Object Oriented Programming using Java Previous Years solved Papers
From Everand
IGNOU PGDCA MCS 206 Object Oriented Programming using Java Previous Years solved Papers
Manish Soni
No ratings yet
Learn Cassandra in 24 Hours
From Everand
Learn Cassandra in 24 Hours
Alex Nordeen
No ratings yet
Oracle Database Administration Interview Questions You'll Most Likely Be Asked: Job Interview Questions Series
From Everand
Oracle Database Administration Interview Questions You'll Most Likely Be Asked: Job Interview Questions Series
Vibrant Publishers
5/5 (1)
Linux Services Deployment
From Everand
Linux Services Deployment
Fabian Mestre
No ratings yet
Hibernate, Spring & Struts Interview Questions You'll Most Likely Be Asked
From Everand
Hibernate, Spring & Struts Interview Questions You'll Most Likely Be Asked
Vibrant Publishers
No ratings yet
AWS Certified Solutions Architect - Professional
From Everand
AWS Certified Solutions Architect - Professional
VB Dev
No ratings yet
Quick Configuration of Openldap and Kerberos in Linux and Authenicating Linux to Active Directory
From Everand
Quick Configuration of Openldap and Kerberos in Linux and Authenicating Linux to Active Directory
Dr. Hidaia Mahmood Alassouli
No ratings yet
HDInsight Essentials - Second Edition
From Everand
HDInsight Essentials - Second Edition
Rajesh Nadipalli
No ratings yet
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
From Everand
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
Manish Soni
No ratings yet
How to a Developers Guide to 4k: Developer edition, #3
From Everand
How to a Developers Guide to 4k: Developer edition, #3
Xinc Cyberwizard
No ratings yet
Inspiring Powershell Articles
From Everand
Inspiring Powershell Articles
Murat Yildirimoglu
No ratings yet

Accessing Data Through The CKAN API

Uploaded by

Accessing Data Through The CKAN API

Uploaded by

HDX CKAN API Cookbook

Accessing data through the CKAN API

1.1. Connecting to the API

To use this with the CKAN Python API library, try

from ckanapi import RemoteCKAN

1.2. Reading a single dataset with package_show

1.3. Anatomy of a dataset

name: the dataset stub on HDX, like "hrp-projects-nga"

description: a longer description of the dataset.

dataseries_name: (optional) a curated series, or list, to which this dataset belongs.

1.3.1. Downloading the data

2. Searching HDX with package_search

To find datasets, we’ll use the package_search endpoint at

(Calling this without parameters returns all the datasets on HDX.)

2.1. Paging through results

from ckancrawler import Crawler
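
If you'd rather page by hand, here is a minimal sketch. It assumes the standard CKAN `rows` and `start` parameters of package_search and a response that carries `count` and `results`. The `fetch_page` callable and the `fake_fetch` helper are hypothetical stand-ins for whatever performs the real request, so the loop can be tried without network access; this is not how ckancrawler itself works.

```python
from typing import Callable, Dict, Iterator


def iter_datasets(fetch_page: Callable[[int, int], Dict], rows: int = 100) -> Iterator[Dict]:
    """Yield every dataset from a paged, package_search-style result."""
    start = 0
    while True:
        page = fetch_page(start, rows)
        results = page.get("results", [])
        for dataset in results:
            yield dataset
        start += len(results)
        # Stop on an empty page or once the reported total has been reached.
        if not results or start >= page.get("count", 0):
            break


def fake_fetch(start: int, rows: int) -> Dict:
    """Hypothetical stand-in for a real API call: serves five fake datasets."""
    names = ["dataset-%d" % i for i in range(5)]
    return {"count": len(names), "results": [{"name": n} for n in names[start:start + rows]]}


all_names = [d["name"] for d in iter_datasets(fake_fetch, rows=2)]
```

In real use, `fetch_page` would wrap a package_search call and pass `start` and `rows` through unchanged.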

2.2. Constructing queries

In Python, the ckanapi will do the escaping for you:

2.2.1. Query filters

vocab_Topics: datasets labelled with this thematic tag (more information).

dataseries_name: datasets belonging to this data series (more information).
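
As a rough illustration of how such filters can be combined, the sketch below builds `field:"value"` clauses and URL-encodes them for a direct API call. The base URL, the use of CKAN's `fq` parameter, and the helper names `field_filter` and `search_url` are assumptions made for this example; the filter values are the ones that appear later in this cookbook.

```python
from urllib.parse import urlencode

# Assumed location of HDX's package_search endpoint.
SEARCH_URL = "https://data.humdata.org/api/3/action/package_search"


def field_filter(field: str, value: str) -> str:
    """Return a field:"value" clause, escaping embedded backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, escaped)


def search_url(*clauses: str, rows: int = 10) -> str:
    """Join the clauses with AND and URL-encode them as the fq parameter."""
    return SEARCH_URL + "?" + urlencode({"fq": " AND ".join(clauses), "rows": rows})


example = search_url(
    field_filter("dataseries_name", "WFP - Food Prices"),
    field_filter("vocab_Topics", "sex and age disaggregated data-sadd"),
)
```

The point of `urlencode` is that spaces, colons, and quotation marks in the clauses are escaped for you, which is the step that is easy to get wrong when typing the URL by hand.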

2.3. Sorting results

metadata_created: when the dataset was first created in CKAN.

title_case_insensitive: title of the dataset.

pageviews_last_14_days: trending (add desc to get the most-popular ones first).

total_res_downloads: number of data downloads (add desc to get the

You can combine these if you wish.
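
One way to combine sort keys, sketched under the assumption that the `sort` parameter takes comma-separated `field direction` pairs (the usual CKAN/Solr form). The helper name `sort_param` is hypothetical.

```python
from typing import Tuple


def sort_param(*keys: Tuple[str, bool]) -> str:
    """Build a sort string from (field, descending) pairs, highest priority first."""
    return ", ".join(
        "%s %s" % (field, "desc" if descending else "asc") for field, descending in keys
    )


# Most-viewed datasets first, ties broken alphabetically by title.
sort = sort_param(("pageviews_last_14_days", True), ("title_case_insensitive", False))
```

The resulting string would be passed as the `sort` argument of a package_search call.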

2.4. Format of search results

results: a list of package/dataset metadata objects, as described in 1.3. Anatomy of a dataset.

3. Simple search examples

3.1. Finding datasets by country/group

3.1.1. Example: Ukraine

3.2. Finding datasets by provider

3.2.1. Example: Integrated Food Security Phase Classification

3.3. Finding datasets by topic tag

A list of topic tags is available at

3.3.1. Example: Gender-based violence

3.4. Finding datasets by data series

A list of data series is available at

Use the dataseries_name field from these results in your queries.

3.4.1. Example: IOM DTM Baseline Assessments

3.5. Finding datasets by free-text search

In Python, you would use

4.1. Latest OCHA 3W for Lebanon

When we put it all together, we end up with this API call:

4.2. Food prices for Venezuela

dataseries_name:"WFP - Food Prices"

And finally, once again, we want to set the sort to

All together, that gives us the following API call:

In Python, we can retrieve the same data like this:

vocab_Topics:"sex and age disaggregated data-sadd"

These result in the following API call:

In Python, you can use the following code:

B.1. Complete list of HDX CKAN search fields

archived batch caveats creator_user_id

data_update_frequency dataseries_name dataset_preview dataset_source

has_geodata has_quickcharts has_showcases id

is_requestdata_type isopen last_modified* license_id

license_title maintainer metadata_created* metadata_modified*

methodology name notes num_of_showcases

num_resources num_tags organization overdue_date*

owner_org package_creator pageviews_last_14_days qa_completed

res_description res_extras_broken_l res_extras_in_hapi res_format

res_name res_url review_date* solr_additions

state subnational title total_res_downloads

type updated_by_script url vocab_Topics

Find datasets modified in the last 24 hours:
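
A hedged sketch of one way to express this, assuming the search backend accepts Solr date-math ranges on the `metadata_modified` field. The helper name `modified_within` is hypothetical, and the exact query the cookbook uses may differ.

```python
def modified_within(amount: int, unit: str = "DAY") -> str:
    """Return a range clause matching datasets modified in the last `amount` units."""
    allowed = {"MINUTE", "HOUR", "DAY", "MONTH", "YEAR"}  # a subset of Solr date-math units
    if unit not in allowed:
        raise ValueError("unsupported unit: %s" % unit)
    if amount < 1:
        raise ValueError("amount must be at least 1")
    return "metadata_modified:[NOW-%d%s TO NOW]" % (amount, unit)


last_day = modified_within(1)
last_24_hours = modified_within(24, "HOUR")
```

Either string can be used as a filter in a package_search call, alongside any other filters you need.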

B.3. Querying with wildcards and ranges

Find datasets that have between 2 and 5 resources:

This is the approved way of doing such selections. The query
