0% found this document useful (0 votes)
99 views26 pages

DMA Details

Uploaded by

karthikp207
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
99 views26 pages

DMA Details

Uploaded by

karthikp207
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 26

DIRECT MEMORY ACCESS

[DMA]
What is Direct Memory Access?
Hardware mechanism that allows peripheral components to transfer their I/O data
directly to and from main memorywithout the need to involve the system processor.

Why is Direct Memory Access important?


Use of this mechanism can greatly increase throughputto and from a device,
because a great deal of computational overhead is eliminated

What is the downside?


Hardware support is required – DMA controllers DMA “steals” cycles from the
processor Synchronization mechanisms must be provided to avoid accessing
non-updated information from RAM

By Ezrah Buki On LinkedIn


Page 1 of 25
What is Direct Memory Access?
DMA is a capability provided by some computer bus architectures, including PCI, PCMCIA
and CardBus, whichallows data to be sent directly from an attached device
to the memory on the host, freeing the CPU from involvement with the data transfer
and thus improving thehost's performance.

DMA Programming
The programming of a device's DMA controller is hardwarespecific. Normally, the OS needs to have
the local
device address, the physical memory address on the PC,and the size of the memory
block to transfer.
Then the register that initiates the transfer is set.

By Ezrah Buki On LinkedIn


Page 2 of 25
By Ezrah Buki On LinkedIn
Page 3 of 25
By Ezrah Buki On LinkedIn
Page 4 of 25
Elaboration
DMA transfers overcome the problem of occupying theCPU for the entire time it's
performing a transfer.

The CPU initiates the transfer, then it executes other ops while the transfer is in progress,
finally it receives an interrupt from the DMA controller when the transfer is done

Hardware using DMA: disk drives, graphics cards,network cards, sound cards

DMA can lead to cache coherency problems


If a CPU has a cache and external memory, then the datathe DMA controller has access
to (stored in RAM) may not be updated with the correct data stored in the cache.

By Ezrah Buki On LinkedIn


Page 5 of 25
Cache Coherency Solutions
Cache-coherent systems:
external writes are signaled to the cache controller which
performs a cache invalidation for incoming DMA transfersor cache flush for outgoing
DMA transfers (done by hardware)

Non-coherent systems:
OS ensures that the cache lines are flushed before an outgoing DMA transfer is
started and invalidated beforea memory range affected by an incoming DMA transfer
is accessed. The OS makes sure that the memory
range is not accessed by any running threads in themeantime.

By Ezrah Buki On LinkedIn


Page 6 of 25
DMA Transfer
Overview (input example)

• software: asks for data (e.g. read called)


1. Device driver allocates a DMA buffer, sends signal todevice indicating where to
send the data, sleeps

2. Device writes data to DMA buffer, raises interrupt whenfinished


3. Interrupt handler gets data from DMA buffer,acknowledges interrupt,
awakens software toprocess the data

By Ezrah Buki On LinkedIn


Page 7 of 25
DMA Transfer
Overview (input example)

• hardware: asynchronously pushes data to the system


May push data even if no process is listening!

1. hardware raises an interrupt to announce that new datahas arrived


2. interrupt handler allocates a buffer, tells the hardwarewhere to transfer the data
3. device writes the data to the buffer, raises anotherinterrupt when transfer is done
4. interrupt handler dispatches the new data, awakens any relevant process, and takes care of
housekeeping

By Ezrah Buki On LinkedIn


Page 8 of 25
DMA Transfer
Overview (input example – network transfers)

• hardware: must deal with continuous data flow


Use a circular (ring) buffer in memory shared by deviceand the processor

1. incoming packet placed in next available bufferIn the ring


2. interrupt is raised by the device
3. device driver sends packet to kernel code thatwill process it
4. device driver inserts a new buffer into the ring (note: buffer allocation occurs at
initialization sothe insertion of a buffer is just the selection of an already
allocated buffer for the ring)

By Ezrah Buki On LinkedIn


Page 9 of 25
DMA Transfer
Overview (input example – network transfers)

• pseudo code for read using circular buffer


Err_Type Procedure read(data_type data) { if (empty_flag) return
(ERR_BUF_EMPTY);else {
memory(read_ptr) = data;full_flag = false;
read_ptr = (((read_ptr-B)+1)mod N)+B;
if (read_ptr == write_ptr) empty_flag = true;return(Err-OK);
}
}

By Ezrah Buki On LinkedIn


Page 10 of 25
DMA Transfer
Graphically:

By Ezrah Buki On LinkedIn


Page 11 of 25
DMA Transfer

By Ezrah Buki On LinkedIn


Page 12 of 25
Allocating a DMA Buffer
Allocation Buffers:
contiguous buffer: contiguous block of memory allocatedscatter/gather: allocated buffer
can be fragmented in the physical memory and does not need to be allocated
contiguously. The allocated physical memory blocks
are mapped to a contiguous buffer in the calling process's virtual address
space, thus enabling easyaccess to the allocated physical memory blocks.
bounce buffer: to allow devices with limited addressingto access all of virtual address
space. A bounce buffer resides in memory low enough for a device to copy from
and write data to. It is then copied to the desired user page in high memory

By Ezrah Buki On LinkedIn


Page 13 of 25
Allocating a DMA Buffer
Considerations PCI:

• Must occupy contiguous pages in physical memory due to PCI bus requirement
using physical addresses

• Note: other buses (e.g. Sbus) use virtual memoryaddresses for the transfer
• The right kind of memory must be allocated since not allmemory zones are suitable:
high memory may not work with some devices because of limited address space

• Memory can be allocated at boot or runtime but driverscan only allocate at runtime

By Ezrah Buki On LinkedIn


Page 14 of 25
Allocating a DMA Buffer
How to Allocate (low level):

• Use mem=xxx to reserve memory beginning at xxx


- done at bootThen
use
dmabuf = ioremap(0xFF00000, 0x100000);to create the buffer in the driver

• Allocate from the DMA zone (16 MB)struct page *buf;


buf = alloc_pages( GFP_DMA, 3);
where 3 is the order and means (1 << 3) pages (fromthe buddy system
allocator)

• Use scatter/gather I/O if the device supports it

By Ezrah Buki On LinkedIn


Page 15 of 25
Allocating a DMA Buffer
How to Allocate (generic DMA layer):

• DMA based hardware uses bus addresses


- these do not match with PCI physical addresses

• Can convert at low-level but then risk inconsistencyacross systems


• Also, different systems have different expectations ofcache coherency – hence
allocate at generic layer

• Device may have DMA limitations – 32 bit addressingis assumed but may be
changed with
int dma_set_mask(struct device *dev, u64 mask);
(e.g. for 24 bit addressing: mask = 0xFFFFFF)

By Ezrah Buki On LinkedIn


Page 16 of 25
Allocating a DMA Buffer
Creating a buffer and a bus address for the device:

• If the device has an IOMMU a set of mapping registersis provided


• A bounce buffer may be necessary – if a driver tries to perform DMA on an address
that is not reachable by dev

• Cache coherency:
copies of recently accessed memory areas are in cacheif device writes to memory,
cache area is invalidated so it will have to be paged in
If device reads data from memory, cache flushed out first

• Generic DMA layer ensures all of above are not a problemover many architectures
provided some rules are obeyed

By Ezrah Buki On LinkedIn


Page 17 of 25
Allocating a DMA Buffer
Rules for generic DMA layer – coherent mapping:

• Data type dma_addr_t represents a bus address It must not be manipulated


by the driver – it can bepassed to the device

• Set up the mapping (and buffer) with this:


void *dma_alloc_coherent(struct device *, size_t, dma_addr_t *, int);
allocates uncached, unbuffered memory for a device forperforming DMA. Allocates pages, returns
the
CPU-viewed (virtual) address, and sets the third arg to thedevice-viewed address. Buffer
is automatically placed where the device can get at it.

• Free the mapping with this:


void dma_free_coherent(struct device *, size_t, void *, dma_addr_t);

By Ezrah Buki On LinkedIn


Page 18 of 25
Allocating a DMA Buffer
Rules for generic DMA layer – streaming DMA mapping:
• Different from coherent mapping because the mappings deal with addresses that
were chosen apriori. If IOMMUexists, registers may hogged up by coherent mapping,
but streaming is single shot, registers are released

• Set up a single buffer with this:


dma_addr_t dma_map_single(struct device*, void*, size_t,
enum dma_data_direction);
Direction choices: DMA_TO_DEVICE, DMA_FROM_DEVICEDMA_BIDIRECTIONAL
void*is the buffer virtual address, dma_addr_tis theReturned bus address for the device

• Free the mapping with this:


void dma_unmap_single(struct device*, dma_addr_t, size_t,
enum dma_data_direction);

By Ezrah Buki On LinkedIn


Page 19 of 25
Allocating a DMA Buffer
Rules for generic DMA layer – streaming DMA mapping:
• Buffer can only be used in the direction specified
• A mapped buffer belongs to the device, not the processorThe device driver must keep
hands off the buffer until
it is unmapped

• A buffer used to send data to a device must containthe data before it is


mapped

• The buffer must not be unmapped while DMA is stillactive, or serious system
instability is guaranteed.

By Ezrah Buki On LinkedIn


Page 20 of 25
Allocating a DMA Buffer
Rules for generic DMA layer – streaming DMA mapping:

By Ezrah Buki On LinkedIn


Page 21 of 25
Allocating a DMA Buffer
Rules for generic DMA layer – scatter/gather mapping:

• Send contents of several buffers over DMACould send then one


at a time: map each
Or with scatter/gather, can send them all at once (speed)

• Many devices can accept a scatterlist o


• f array pointersand lengths
• But scatterlist entries must be of page size (except ends)

By Ezrah Buki On LinkedIn


Page 22 of 25
Using a scatterlist DMA transfer
Steps:

• struct scatterlist { /* in asm/scatterlist.h */unsigned long page_link;


unsigned int offset;
unsigned int length; dma_addr_t
dma_address;unsigned
int dma_length;
};

• Create array sgof scatterlist entries and fill them inLet nentbe the number of
entries

• Call
int dma_map_sg(device, sg, nent, direction)
returns the number of DMA buffers to send (<= nent)

• When done do this:


void dma_unmap_sg(device, sg, nent, direction);

By Ezrah Buki On LinkedIn


Page 23 of 25
PCI Double Address Cycle (DAC)
• The PCI bus supports 64-bit addressing (DAC)
• The generic DMA layer does not support this mode
- it is a PCI-specific feature
- many implementations of DAC are buggy
- DAC is slower than a regular 32-bit DMA
• But DAC could be useful with large buffer in high memorythen PCI-specific routines must
be used

• Use a mask m(must return 0 to continue)


int pci_dac_set_dma_mask(struct pci_dev *pdev, u64 m);

• Call this to map:


dma64_addr_t pci_dac_page_to_dma
(struct pci_dev *pdev, struct page *page,unsigned long offset, int
direction);

• No need to unmap – no external resources used(mapping pages directly)


By Ezrah Buki On LinkedIn
Page 24 of 25
Examples

• dad.c
• usb_skel.c

By Ezrah Buki On LinkedIn


Page 25 of 25

You might also like