Stream Fusion For Haskell Arrays

Arrays have traditionally been an awkward data structure for Haskell programmers. Despite the large number of array libraries available, they have remained relatively awkward to use in comparison to the rich suite of purely functional data structures, such as finger trees or finite maps. Arrays have simply not been first-class citizens in the language. In this talk we'll begin with a survey of the more than a dozen array types available, including some new matrix libraries developed in the past year. I'll then describe a new efficient, pure, and flexible array library for Haskell with a list-like interface, based on work in the Data Parallel Haskell project, that employs stream fusion to dramatically reduce the cost of pure arrays. The implementation will be presented from the ground up, along with a discussion of the entire compilation process of the library, from source to assembly. Source: http://www.galois.com/blog/2008/08/28/galois-tech-talks/

Uploaded by

Don Stewart


Stream Fusion for Haskell Arrays

Don Stewart
Galois Inc
Haskell's Data Types
● Beautiful algebraic data types:

data Set a
= Tip
| Bin !Int a !(Set a) !(Set a)

● Concise notation, inductive reasoning, type math!

● Polymorphic, strongly typed, side effect free
● Efficient. GCd. Strict, or lazy, or roll your own
● Pointers, pointers...
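To make the "inductive reasoning" point concrete, here is a minimal sketch of two functions over the Set type above. The names size, member and insert are chosen for illustration, and the insert shown here skips the rebalancing a real library would perform.

```haskell
-- Same shape as the slide's Set: cached size, element, left and right subtrees.
data Set a
  = Tip
  | Bin !Int a !(Set a) !(Set a)

-- O(1) size thanks to the cached Int field.
size :: Set a -> Int
size Tip           = 0
size (Bin n _ _ _) = n

-- Structural recursion: one equation per constructor.
member :: Ord a => a -> Set a -> Bool
member _ Tip = False
member x (Bin _ y l r) = case compare x y of
  LT -> member x l
  GT -> member x r
  EQ -> True

-- Unbalanced insert, for illustration only (no rotations).
insert :: Ord a => a -> Set a -> Set a
insert x Tip = Bin 1 x Tip Tip
insert x t@(Bin _ y l r) = case compare x y of
  LT -> bin y (insert x l) r
  GT -> bin y l (insert x r)
  EQ -> t
  where
    bin v a b = Bin (size a + size b + 1) v a b
```

Every Bin node is a separate heap object reached through pointers, which is the cost the next slide turns to.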
But for real speed...
Sometimes we need unboxed, flat structures:
Arrays in Haskell
biodiversity!
Data.Array Foreign.ForeignPtr
Data.Array.Diff Data.ByteString
Data.Array.IO Data.ByteString.Lazy
Data.Array.Storable Data.PackedString
Data.Array.ST Data.StorableVector
Data.Array.Unboxed Data.Vec
Data.Array.CArray BLAS.Matrix
Data.ArrayBZ Data.Packed
Foreign.Array Data.Packed.Vector
Foreign.Ptr Data.Packed.Matrix
The Perfect Array Type

1. Very, very efficient. Ruthlessly fast.
2. Polymorphic
3. Pure
4. Rich list-like API
5. Compatible with C arrays, other arrays
Data Parallel Haskell
● Project to target large multicore systems:
Chakravarty, Leshchinskiy, Peyton Jones, Keller, Marlow
● Parallel, distributed arrays, with good interface
● Built from flat, unlifted arrays
● The core of a better array type for mortals
● Built around array fusion
“Stream Fusion: From Lists to Streams to Nothing at All”
Coutts, Leshchinskiy, Stewart. 2007.
● Key technique for making arrays flexible and fast
uvector: fast, flat, fused arrays
Two data types: mutable arrays and pure arrays
data BUArr e =
BUArr !Int
!Int
ByteArray#

data MBUArr s e =
MBUArr !Int
(MutableByteArray# s)

● Fill the mutable array, freeze it, and get free substrings,
and persistence.
● Low level Haskell
Primitive operations
length :: BUArr e -> Int
length (BUArr _ n _) = n

class UAE e where
  sizeBU   :: Int -> e -> Int
  indexBU  :: BUArr e -> Int -> e
  readMBU  :: MBUArr s e -> Int -> ST s e
  writeMBU :: MBUArr s e -> Int -> e -> ST s ()

newMBU :: UAE e => Int -> ST s (MBUArr s e)

Conversions
Zero-copying conversion from mutable to pure

unsafeFreezeMBU
:: MBUArr s e -> Int -> ST s (BUArr e)

unsafeFreezeMBU (MBUArr m mba) n =

checkLen "unsafeFreezeMBU" m n $
ST $ \s ->
(# s, BUArr 0 n (unsafeCoerce# mba) #)

Bounds checking compiled out if -funsafe

Array element instances
Simple per-type representation choices
instance UAE () where
sizeBU _ _ = 0
indexBU (BUArr _ _ _) (I# _) = ()

readMBU (MBUArr _ _) (I# _) = ST $ \s ->

(# s, () #)
writeMBU (MBUArr _ _) (I# _) () = ST $ \s ->
(# s, () #)
Goal 1: Efficiency
Can be a bit fancier...
instance UAE Bool where
readMBU (MBUArr n mba) i@(I# i#) = ST $ \s ->
case readWordArray# mba (bOOL_INDEX i#) s of
(# s2, r# #) ->
(# s2, (r# `and#` bOOL_BIT i#)
`neWord#` int2Word# 0# #)

bOOL_INDEX :: Int# -> Int#

#if SIZEOF_HSWORD == 4
bOOL_INDEX i# = i# `uncheckedIShiftRA#` 5#
#elif SIZEOF_HSWORD == 8
bOOL_INDEX i# = i# `uncheckedIShiftRA#` 6#
#endif
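The bit arithmetic behind the Bool instance can be sketched without primops. This is a pure model, not the library's code: boolIndex and boolBit are assumed counterparts of bOOL_INDEX and bOOL_BIT (the latter is not shown on the slide), and a list of Words stands in for the byte array.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.), finiteBitSize, countTrailingZeros)
import Data.Word (Word)

-- Bits per machine word: 32 or 64, depending on the platform.
wordBits :: Int
wordBits = finiteBitSize (0 :: Word)

-- log2 of the word size: 5 or 6, matching the two CPP branches.
wordShift :: Int
wordShift = countTrailingZeros wordBits

-- Which word holds bit i.
boolIndex :: Int -> Int
boolIndex i = i `shiftR` wordShift

-- Mask selecting bit i within its word.
boolBit :: Int -> Word
boolBit i = 1 `shiftL` (i .&. (wordBits - 1))

-- Pack booleans one bit each, wordBits per Word.
packBools :: [Bool] -> [Word]
packBools [] = []
packBools bs = word : packBools rest
  where
    (chunk, rest) = splitAt wordBits bs
    word = foldr (.|.) 0 [boolBit i | (i, True) <- zip [0 ..] chunk]

-- Read bit i back: same test as the slide's `and#` / `neWord#` pair.
readBool :: [Word] -> Int -> Bool
readBool ws i = ((ws !! boolIndex i) .&. boolBit i) /= 0
```

Reading any index of the packed form returns the Bool originally stored there, at one bit of storage per element.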
Relax. Low level stuff done.
Goal 2: polymorphic
Abstract over the primitive arrays
class UA e where

data UArr e
data MUArr e :: * -> *

lengthU :: UArr e -> Int

indexU :: UArr e -> Int -> e

lengthMU :: MUArr e s -> Int

newMU :: Int -> ST s (MUArr e s)
freezeMU :: MUArr e s -> Int -> ST s (UArr e)

readMU :: MUArr e s -> Int -> ST s e

writeMU :: MUArr e s -> Int -> e -> ST s ()
Goal 3a: Pure
Introducing UArr .. purely!

newU :: UA e
=> Int
-> (forall s. MUArr e s -> ST s Int)
-> UArr e

newU n init =
runST (do
ma <- newMU n
n' <- init ma
freezeMU ma n'
)

Mutation is encapsulated in the ST monad.
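The same allocate, fill, freeze pattern can be tried with the standard array package. This is a sketch by analogy only: it uses STUArray and runSTUArray rather than the MUArr and newU shown above, and squares is a made-up example function.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (runSTUArray, newArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Build a pure unboxed array by mutating a fresh array inside ST.
-- runSTUArray freezes the result without copying; the rank-2 type of
-- the ST action keeps the mutable array from escaping.
squares :: Int -> UArray Int Int
squares n = runSTUArray $ do
  marr <- newArray (0, n - 1) 0          -- allocate n zeroed slots
  forM_ [0 .. n - 1] $ \i ->
    writeArray marr i (i * i)            -- in-place writes
  return marr
```

Callers see an ordinary pure value: elems (squares 5) can be evaluated anywhere, with no trace of the mutation used to build it.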

Flexible array representations
instance UA () where
newtype UArr () = UAUnit Int
newtype MUArr () s = MUAUnit Int

lengthU (UAUnit n) = n
indexU (UAUnit _) _ = ()
sliceU (UAUnit _) _ n = UAUnit n

lengthMU (MUAUnit n) = n
newMU n = return $ MUAUnit n
readMU (MUAUnit _) _ = return ()
writeMU (MUAUnit _) _ _ = return ()

freezeMU (MUAUnit _) n = return $ UAUnit n

Goal 4: list-like operations
data (:*:) a b = !a :*: !b

instance (UA a, UA b) => UA (a :*: b) where

data UArr (a :*: b)

= UAProd !(UArr a) !(UArr b)

data MUArr (a :*: b) s

= MUAProd !(MUArr a s) !(MUArr b s)

indexU (UAProd l r) i =
indexU l i :*: indexU r i
Support for numeric stuff
instance (RealFloat a, UA a)
=> UA (Complex a) where

newtype UArr (Complex a)

= UAComplex (UArr (a :*: a))
newtype MUArr (Complex a) s
= MUAComplex (MUArr (a :*: a) s)

indexU (UAComplex arr) i =

case indexU arr i of
(a :*: b) -> a :+ b
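The product and Complex instances amount to a struct-of-arrays layout. A minimal sketch of that idea with plain unboxed arrays follows; PairArr, ComplexArr, fromPairs, indexPair and indexComplex are hypothetical names, fixed to Double to avoid the type-family machinery.

```haskell
{-# LANGUAGE TypeOperators #-}
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Complex (Complex((:+)))

-- Strict pair, as on the slide.
data a :*: b = !a :*: !b

-- An "array of pairs" stored as a pair of flat arrays.
data PairArr = PairArr !(UArray Int Double) !(UArray Int Double)

fromPairs :: [Double :*: Double] -> PairArr
fromPairs ps = PairArr (mk [a | a :*: _ <- ps]) (mk [b | _ :*: b <- ps])
  where
    mk :: [Double] -> UArray Int Double
    mk = listArray (0, length ps - 1)

-- Indexing reads both component arrays and rebuilds the pair.
indexPair :: PairArr -> Int -> (Double :*: Double)
indexPair (PairArr l r) i = (l ! i) :*: (r ! i)

-- A complex array is just a wrapper around an array of pairs.
newtype ComplexArr = ComplexArr PairArr

indexComplex :: ComplexArr -> Int -> Complex Double
indexComplex (ComplexArr arr) i =
  case indexPair arr i of
    a :*: b -> a :+ b
```

No pair or Complex value is ever stored in the array; they exist only transiently at the point of indexing.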
But that's not the end

• Strict, pure arrays are a bit too inefficient

• Too much copying, not enough sharing
• Impure languages would just mutate in place
• But we need to find some other way to deforest.
Goal 1&2: Efficiency
Stream Fusion
data Step s a = Done
| Skip !s
| Yield !a !s

-- existential state type; GHC spells "exists s." as "forall s."
data Stream a = forall s.
  Stream (s -> Step s a) !s Int
● Abstract sequence transformers
● Non-recursive

● General fusion rule for removing intermediates

● We'll convert arrays into abstract sequences

● Non-recursive things we can optimise ruthlessly
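A small self-contained sketch of these types, using lists instead of arrays so it runs with base alone. streamL, unstreamL and filterS are illustrative names; filterS shows why Skip exists: a transformer can drop an element without looping.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Step s a = Done | Skip !s | Yield !a !s

-- Stepper function, initial state, size hint.
data Stream a = forall s. Stream (s -> Step s a) !s Int

-- Lists stand in for arrays in this sketch.
streamL :: [a] -> Stream a
streamL xs = Stream next xs (length xs)
  where
    next []       = Done
    next (y : ys) = Yield y ys

-- The only recursive function: it drives the stepper to completion.
unstreamL :: Stream a -> [a]
unstreamL (Stream next s0 _) = go s0
  where
    go s = case next s of
      Done       -> []
      Skip s'    -> go s'
      Yield x s' -> x : go s'

-- Non-recursive: rejected elements become Skip, so no loop is needed here.
filterS :: (a -> Bool) -> Stream a -> Stream a
filterS p (Stream next s0 n) = Stream next' s0 n
  where
    next' s = case next s of
      Done    -> Done
      Skip s' -> Skip s'
      Yield x s'
        | p x       -> Yield x s'
        | otherwise -> Skip s'
```

Because filterS only wraps the stepper, the compiler can inline it into whatever consumes the stream.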

Conversion to and from arrays
streamU :: UA a => UArr a -> Stream a

streamU arr = Stream next 0 n

where
n = lengthU arr

next i | i == n = Done
| otherwise = Yield (arr `indexU` i) (i+1)

unstreamU :: UA a => Stream a -> UArr a

unstreamU st@(Stream next s n) =

newDynU n (\marr -> unstreamMU marr st)
Convert recursive array loops
to non-recursive streams

mapU :: (UA e, UA e')

=> (e -> e') -> UArr e -> UArr e'
mapU f = unstreamU . mapS f . streamU

headU :: UA e => UArr e -> e

headU = headS . streamU

lastU :: UA e => UArr e -> e

lastU = foldl1U (flip const)
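The same wrapping pattern, sketched against the standard UArray so it can be run without uvector. Arr, streamA, unstreamA, mapA and headA are hypothetical stand-ins for UArr, streamU, unstreamU, mapU and headU; unstreamA goes through a list for brevity, where the library fills a mutable array instead.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Array.Unboxed (UArray, bounds, elems, listArray, (!))

data Step s a = Done | Skip !s | Yield !a !s
data Stream a = forall s. Stream (s -> Step s a) !s Int

type Arr = UArray Int Int

-- The array loop becomes a stepper over the index.
streamA :: Arr -> Stream Int
streamA arr = Stream next lo (hi - lo + 1)
  where
    (lo, hi) = bounds arr
    next i | i > hi    = Done
           | otherwise = Yield (arr ! i) (i + 1)

-- Simplified: collects into a list first, then builds the array.
unstreamA :: Stream Int -> Arr
unstreamA (Stream next s0 _) = listArray (0, length xs - 1) xs
  where
    xs = go s0
    go s = case next s of
      Done       -> []
      Skip s'    -> go s'
      Yield x s' -> x : go s'

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0 n) = Stream next' s0 n
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s' -> Yield (f x) s'

headS :: Stream a -> a
headS (Stream next s0 _) = go s0
  where
    go s = case next s of
      Yield x _ -> x
      Skip s'   -> go s'
      Done      -> error "headS: empty stream"

-- Array operations are thin wrappers around stream operations.
mapA :: (Int -> Int) -> Arr -> Arr
mapA f = unstreamA . mapS f . streamA

headA :: Arr -> Int
headA = headS . streamA
```

For instance, elems (mapA (+ 1) (listArray (0, 2) [1, 2, 3])) evaluates to [2, 3, 4].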
The fusion rule
● Use rules to remove redundant conversions

"streamU/unstreamU" forall s.
streamU (unstreamU s) = s

● Compositions of non-recursive functions left over

● Then combine streams using general optimisations
● Arrays at the end will be fused from the combined stream pipeline
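A sketch of how such a rule is declared, again on lists so it compiles with base alone. The names streamL, unstreamL and mapL are illustrative; the INLINE [1] phase annotations are one common way to keep the two conversions visible long enough for the rule to match, and are an assumption rather than the library's exact pragmas.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Step s a = Done | Skip !s | Yield !a !s
data Stream a = forall s. Stream (s -> Step s a) !s Int

streamL :: [a] -> Stream a
streamL xs = Stream next xs (length xs)
  where
    next []       = Done
    next (y : ys) = Yield y ys
{-# INLINE [1] streamL #-}

unstreamL :: Stream a -> [a]
unstreamL (Stream next s0 _) = go s0
  where
    go s = case next s of
      Done       -> []
      Skip s'    -> go s'
      Yield x s' -> x : go s'
{-# INLINE [1] unstreamL #-}

-- Converting to a list and straight back to a stream is the identity.
{-# RULES
"streamL/unstreamL" forall s. streamL (unstreamL s) = s
  #-}

mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0 n) = Stream next' s0 n
  where
    next' s = case next s of
      Done       -> Done
      Skip s'    -> Skip s'
      Yield x s' -> Yield (f x) s'
{-# INLINE mapS #-}

mapL :: (a -> b) -> [a] -> [b]
mapL f = unstreamL . mapS f . streamL
{-# INLINE mapL #-}

-- mapL f (mapL g xs)
--   = unstreamL (mapS f (streamL (unstreamL (mapS g (streamL xs)))))
--   = unstreamL (mapS f (mapS g (streamL xs)))      -- by the rule
```

With optimisation on, the intermediate list in mapL f (mapL g xs) can be removed by the rule; without it, the program computes the same result, only slower.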

Filling a mutable array
unstreamMU ::
UA a => MUArr a s -> Stream a -> ST s Int

unstreamMU marr (Stream next s n) = fill s 0

where
fill s !i = case next s of
Done -> return i
Skip s' -> s' `seq` fill s' i
Yield x s' -> s' `seq` do
writeMU marr i x
fill s' (i+1)
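The fill loop can be exercised with the standard STUArray. This is a sketch under assumptions: fillST and unstreamA are hypothetical names mirroring unstreamMU and unstreamU, evensBelow is a made-up producer that emits Skip steps, and the final trim copies the filled prefix where the library would slice without copying.

```haskell
{-# LANGUAGE BangPatterns, ExistentialQuantification #-}
import Control.Monad.ST (ST, runST)
import Data.Array.ST (STUArray, freeze, newArray_, writeArray)
import Data.Array.Unboxed (UArray, elems, listArray)

data Step s a = Done | Skip !s | Yield !a !s
data Stream a = forall s. Stream (s -> Step s a) !s Int

-- Write each yielded element; return how many slots were filled.
fillST :: STUArray st Int Int -> Stream Int -> ST st Int
fillST marr (Stream next s0 _) = fill s0 0
  where
    fill s !i = case next s of
      Done       -> return i
      Skip s'    -> fill s' i
      Yield x s' -> writeArray marr i x >> fill s' (i + 1)

newArr :: Int -> ST st (STUArray st Int Int)
newArr n = newArray_ (0, n - 1)

freezeArr :: STUArray st Int Int -> ST st (UArray Int Int)
freezeArr = freeze

-- Allocate from the size hint, fill, then keep only the filled prefix.
unstreamA :: Stream Int -> UArray Int Int
unstreamA st@(Stream _ _ n) = runST $ do
  marr <- newArr n
  len  <- fillST marr st
  full <- freezeArr marr
  return (listArray (0, len - 1) (take len (elems full)))

-- Example producer: even numbers below n, skipping the odd ones.
evensBelow :: Int -> Stream Int
evensBelow n = Stream next 0 n
  where
    next i | i >= n    = Done
           | odd i     = Skip (i + 1)
           | otherwise = Yield i (i + 1)
```

The size hint only has to be an upper bound: evensBelow 7 allocates seven slots and fills four of them.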
New streams
emptyS :: Stream a
emptyS = Stream (const Done) () 0

replicateS :: Int -> a -> Stream a

replicateS n x = Stream next 0 n
where
next i | i == n = Done
| otherwise = Yield x (i+1)
enumFromToS
:: (Ord a, RealFrac a) => a -> a -> Stream a

-- size hint must be an upper bound: unstreamU allocates from it
enumFromToS n m = Stream next n (max 0 (truncate (m - n) + 2))

where
lim = m + 1/2
next s | s > lim = Done
| otherwise = Yield s (s+1)
Transforming streams
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s n) = Stream next' s n
where
next' s = case next s of
Done -> Done
Skip s' -> Skip s'
Yield x s' -> Yield (f x) s'

foldS :: (b -> a -> b) -> b -> Stream a -> b

foldS f z (Stream next s _) = fold z s
where
fold !z s = case next s of
Yield x !s' -> fold (f z x) s'
Skip !s' -> fold z s'
Done -> z
Zipping streams

zipWithS
:: (a -> b -> c) -> Stream a -> Stream b -> Stream c

zipWithS f (Stream next1 s m)

(Stream next2 t n) = Stream next (s :*: t) m
where
next (s :*: t) = case next1 s of
Done -> Done
Skip s' -> Skip (s' :*: t)
Yield x s' -> case next2 t of
Done -> Done
Skip t' -> Skip (s :*: t')
Yield y t' -> Yield (f x y) (s' :*: t')
Arrays to streams to nothing at all ...
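A pipeline built purely from the stream functions on the preceding slides never allocates an array. The sketch below is self-contained; enumS is a simplified Int-only variant of enumFromToS, Pair is a local strict pair type, and dot is a made-up example. zipWithS follows the slide, except that the size hint takes the smaller of the two.

```haskell
{-# LANGUAGE BangPatterns, ExistentialQuantification #-}

data Step s a = Done | Skip !s | Yield !a !s
data Stream a = forall s. Stream (s -> Step s a) !s Int

-- Strict pair used as the combined state of two streams.
data Pair a b = !a :*: !b

enumS :: Int -> Int -> Stream Int
enumS lo hi = Stream next lo (max 0 (hi - lo + 1))
  where
    next i | i > hi    = Done
           | otherwise = Yield i (i + 1)

replicateS :: Int -> a -> Stream a
replicateS n x = Stream next 0 n
  where
    next i | i >= n    = Done
           | otherwise = Yield x (i + 1)

zipWithS :: (a -> b -> c) -> Stream a -> Stream b -> Stream c
zipWithS f (Stream next1 s0 m) (Stream next2 t0 n) =
  Stream next (s0 :*: t0) (min m n)
  where
    next (s :*: t) = case next1 s of
      Done       -> Done
      Skip s'    -> Skip (s' :*: t)
      Yield x s' -> case next2 t of
        Done       -> Done
        Skip t'    -> Skip (s :*: t')          -- retry the left side next step
        Yield y t' -> Yield (f x y) (s' :*: t')

foldS :: (b -> a -> b) -> b -> Stream a -> b
foldS f z0 (Stream next s0 _) = go z0 s0
  where
    go !z s = case next s of
      Yield x s' -> go (f z x) s'
      Skip s'    -> go z s'
      Done       -> z

-- Dot product of [1..n] with n copies of 2: no intermediate structure.
dot :: Int -> Int
dot n = foldS (+) 0 (zipWithS (*) (enumS 1 n) (replicateS n 2))
```

After inlining, the only recursive function left is the loop inside foldS, so dot can compile down to a single accumulating loop over two counters.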
Future
● Allow users to pick and choose between fused
or direct implementations
● Write some big programs in this style
● Goal 4: more conversions from other array
types (e.g. ByteStrings, Ptr a)
● Conversions to and from other sequence types
via streams – no overhead for the conversion
● DPH's goals: parallel nested arrays, fusible
mutable arrays.
OM NOM NOM NOM

It's on hackage.haskell.org
