Part 6 - Data in Docker

This document discusses using data with Docker containers, focusing on volumes. It explains that data in containers can be temporary or persistent. Temporary data is stored in the container's writable layer or via tmpfs mounts. Persistent data can be stored using bind mounts or volumes. Volumes provide more functionality than bind mounts and are the better option for persisting data. The document then covers how to create, inspect, and remove volumes using Docker commands.


Pump up the Volumes: Data in Docker
Part 6 of Learn Enough Docker to be Useful

Jeff Hale Feb 11, 2019 · 6 min read

This article is about using data with Docker. In it, we’ll focus on Docker
volumes. Check out the previous articles in the series if you haven’t yet.
We covered Docker concepts, the ecosystem, Dockerfiles, slimming down
images, and popular commands.


Pushing the food metaphor running through these articles to the breaking point, let's compare data in Docker to spices. Just as there are many spices in the world, there are many ways to save data with Docker.

Quick FYI: this guide is current for Docker Engine Version 18.09.1 and
API version 1.39.

Data in Docker can be either temporary or persistent. Let's check out temporary data first.

Temporary Data
Data can be kept temporarily inside a Docker container in two ways.

By default, files created by an application inside a container are stored in the writable layer of the container. You don't have to set anything up. This is the quick and dirty way: just save a file and go about your business. However, when your container ceases to exist, so will your data.

You have another option if you want better performance for saving
temporary data with Docker. If you don’t need your data to persist
beyond the life of the container, a tmpfs mount is a temporary mount that
uses the host’s memory. A tmpfs mount has the benefit of faster read and
write operations.
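As a sketch, here's one way a tmpfs mount might be created from the command line (the mount point /app/cache and the image name my_image are hypothetical, and the command needs a running Docker daemon):

```shell
# Mount an in-memory tmpfs filesystem at /app/cache inside the container.
# Anything written there lives in host memory and vanishes when the container stops.
docker run --mount type=tmpfs,destination=/app/cache my_image
```

Because tmpfs lives in RAM, it's a good fit for scratch space such as caches or secrets you never want written to disk.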

Many times you will want your data to exist even after the container is
long gone. You need to persist your data.

Persistent Data
There are two ways to persist data beyond the life of the container. One
way is to bind mount a file system to the container. With a bind mount,
processes outside Docker also can modify the data.
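For illustration, a bind mount might look like this (the host path and image name are hypothetical; the source directory must already exist on the host):

```shell
# Mount the host directory ./app_data into the container at /app/data.
# Changes made on either side are immediately visible to the other.
docker run --mount type=bind,source="$(pwd)"/app_data,target=/app/data my_image
```

Because the container sees a live host directory, bind mounts are handy for local development but tie the container to the host's filesystem layout.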

From the Docker Docs

Bind mounts are difficult to back up, migrate, or share with other containers. Volumes are a better way to persist data.

Volumes
A volume is a file system that lives on the host machine outside of any container. Volumes are created and managed by Docker. Volumes are:

persistent

free-floating filesystems, separate from any one container

sharable with other containers

efficient for input and output

able to be hosted on remote cloud providers

encryptable

nameable

able to have their content pre-populated by a container

handy for testing

That’s a lot of useful functionality! Now let’s look at how you make a
Volume.


Creating Volumes
Volumes can be created via a Dockerfile or an API request.

Here’s a Dockerfile instruction that creates a volume at run time:

VOLUME /my_volume

Then, when the container is created, Docker will create the volume with any data that already exists at the specified location. Note that if you create a volume using a Dockerfile, you still need to declare the mountpoint for the volume at run time.

You can also create a volume in a Dockerfile using JSON array formatting. See this earlier article in this series for more on Dockerfiles.
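As a sketch, the JSON array form of the same instruction might look like this in a Dockerfile (the base image and volume path are hypothetical):

```
FROM alpine:3.9

# JSON array formatting — equivalent to the plain form VOLUME /my_volume
VOLUME ["/my_volume"]
```

The JSON array form matters mainly when a path contains spaces, but some teams use it everywhere for consistency.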

Volumes also can be instantiated at run time from the command line.

Volume CLI Commands


Create
You can create a stand-alone volume with docker volume create my_volume .

Inspect
List Docker volumes with docker volume ls .

Volumes can be inspected with docker volume inspect my_volume .

Remove
Then you can delete the volume with docker volume rm my_volume .

Dangling volumes are volumes not used by a container. You can remove
all dangling volumes with docker volume prune . Docker will warn you and
ask for confirmation before deletion.
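Putting those commands together, a quick session at the terminal might look like this (the volume name is arbitrary, and a Docker daemon must be running):

```shell
docker volume create my_volume     # create a named volume
docker volume ls                   # list volumes; my_volume should appear
docker volume inspect my_volume    # show driver, mountpoint, and labels as JSON
docker volume rm my_volume         # delete the volume
docker volume prune                # remove all dangling volumes (asks for confirmation)
```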

If the volume is associated with any containers, you cannot remove it until the containers are deleted. Even then, Docker sometimes doesn't realize that the containers are gone. If this occurs, you can use docker system prune to clean up all your Docker resources. Then you should be able to delete the volume.


Working with --mount vs. --volume

You will often use flags to refer to your volumes. For example, to create a
volume at the same time you create a container use the following:

docker container run --mount source=my_volume,target=/container/path/for/volume my_image

In the old days (i.e. pre-2017) 😏 the --volume flag was popular. Originally, the -v or --volume flag was used for standalone containers and the --mount flag was used with Docker swarm services. However, beginning with Docker 17.06, you can use --mount in all cases.

The syntax for --mount is a bit more verbose, but it's preferred over --volume for several reasons. --mount is the only way you can work with services or specify volume driver options. It's also simpler to use.

You’ll see a lot of -v ’s in existing code. Beware that the format for the
options is different for --mount and --volume . You often can’t just replace
a -v in your existing code with a --mount and be done with it.

The biggest difference is that the -v syntax combines all the options
together in one field, while the --mount syntax separates them. Let’s see
--mount in action!


--mount — options are key-value pairs. Each pair is formatted like this: key=value , with a comma between one pair and the next. Common options:

type — mount type. Options are bind , volume , or tmpfs . We're all about the volume .

source — source of the mount. For named volumes, this is the name of the volume. For unnamed volumes, this option is omitted. The key can be shortened to src .

destination — the path where the file or directory is mounted in the container. The key can be shortened to dst or target .

readonly — mounts the volume as read-only. Optional. Takes no value.

Here's an example with lots of options:

docker run --mount type=volume,source=volume_name,destination=/path/in/container,readonly my_image
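For comparison, the same mount expressed with the older -v syntax might look like this (note how all the options are squeezed into a single colon-separated field, with ro standing in for readonly):

```shell
# Equivalent -v form: name:container_path:options, all in one field
docker run -v volume_name:/path/in/container:ro my_image
```

This compactness is exactly why -v is harder to read and why mixed-up fields are a common source of mistakes when converting old commands.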

Volumes are like spices — they make most things better. 🥘

Wrap
Recap of Key Volume Commands
docker volume create

docker volume ls

docker volume inspect

docker volume rm

docker volume prune

Common options for the --mount flag in docker run --mount my_options my_image :

type=volume

source=volume_name

destination=/path/in/container

readonly

Now that you've familiarized yourself with data storage in Docker, let's look at possible next steps for your Docker journey.

Next steps

Update: I recently published an article on Docker security. Check it out and learn how to keep your containers safe. 😃
If you haven’t read the articles in this series on Docker concepts, the
Docker ecosystem, Dockerfiles, slim images, and commands, check those
out, too.

If you're looking for another article on Docker concepts to help cement your understanding, check out Preethi Kasireddy's great article here.

If you want to go deeper, check out Nigel Poulton’s book Docker Deep Dive
(make sure to get the most recent version).

If you want to do a lot of building while you learn, check out James
Turnbull’s The Docker Book.

I hope you found this series to be a helpful intro to Docker. If you did,
please share it with others on your favorite forums or social media
channels so your friends can find it, too! 😃
I've written some articles on orchestrating containers with Kubernetes you can read here. I write articles about Python, data science, AI, and other tech topics. Check them out and follow me if you're into that stuff.

Thanks for reading! 👏


Thanks to Kathleen Hale. 

