**Authors**
Juan Carlos Saenz-Carrasco
Mike Stannett

**License**
CC-BY-3.0

# Overcoming Non-Distributivity: A Case Study in Functional Programming

Juan Carlos Saenz-Carrasco∗ and Mike Stannett
The University of Sheffield, Department of Computer Science, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom
jcsaenzcarrasco1@sheffield.ac.uk, m.stannett@sheffield.ac.uk

**Abstract.** Among the hard problems in Computer Science are those that are easy to write but hard to run, especially when the data structures that represent them are persistent. In this paper we present two different problems, in the fashion of case studies, which share a common algebraic problem: 'multiplication' is not distributive over 'addition' (for example, bi-criteria and knapsack problems). Our proposed solution comprises two simple non-strict pure functions within a standard search algorithm, and we show that this is enough to overcome the lack of distributivity in these situations. We also provide benchmarks comparing the strict and non-strict versions. We believe that this simple solution provides a useful exercise for students of functional programming, since it fruitfully combines several different aspects of the paradigm: persistent data structures, polymorphic data types, and different evaluation strategies are all involved.

## 1 Introduction

One of the aims of functional programming (FP) is to write down the simplest and clearest specification to solve a problem, without regard to how efficient the result might be. In the vast repertoire of hard problems in Computer Science, there are some (reasonably tractable) problems that can be stated in simple and clear terms, but whose standard solutions, of the sort that might be generated by a Haskell compiler, can take an undesirable amount of time to run.
In this paper we describe a case study, suitable for discussion in graduate-level programming classes, showing how this situation can arise in contexts where key algebraic properties fail (in our case, the failure of multiplication to distribute over addition), and how a change of approach can render such problems more easily soluble. This case study focuses on the bottleneck shortest path (BSP) problem. We analyse the problem, demonstrate the failure of distributivity, and illustrate how our approach can be applied.

The remainder of this paper is organised as follows. Sect. 2 outlines some key mathematical definitions and their Haskell implementations. The problem we are addressing is introduced through an example in Sect. 3, which is solved by the proposal in Sect. 4. The core of the paper is Sect. 5, where implementations are given for solving BSP. We conclude by benchmarking various test cases in Sect. 6.

∗ Research supported by the Consejo Nacional de Ciencia y Tecnología, CONACYT (the Mexican National Council for Science and Technology) under grant 411550, scholar 580617 and CVU 214885.

Submitted to TFPIE 2017. This work is licensed under the Creative Commons Attribution License.

## 2 Fundamental definitions

For the purposes of this overview we limit BSP to the context of weighted directed acyclic graphs, although it can also be considered for other types of graph [2]. We write N for the set of natural numbers (including zero), and R⁺ for the set of positive real numbers.

```haskell
type Graph   = Table [Vertex]   -- Adjacency list representation of a graph,
                                -- mapping each vertex to its list of successors.
type Table a = Array Vertex a   -- Table indexed by a contiguous set of vertices.
type Vertex  = Int              -- Abstract representation of vertices.
```

Figure 1: Representation of graphs in Data.Graph. (From [4].)

**Definition 1.**
A weighted directed graph is a pair G = (V, E) comprising a finite non-empty set V of vertices and a finite non-empty set E ⊆ V × V of edges, together with a weight function w : E → R⁺. A path is a non-empty graph P = (V, E) of the form

V = {x_0, x_1, ..., x_k},  E = {(x_0, x_1), (x_1, x_2), ..., (x_{k−1}, x_k)},

where the x_i are all distinct. The vertices x_0 and x_k are linked by P and are called its ends. Note that k is allowed to be zero, in which case E is empty and we have a singleton path.

In the basic Haskell representation (Fig. 1) each vertex is assigned an integer value; this allows vertices to be used as array indices. The graph itself is represented as an array which captures, for each index (i.e. vertex), which vertices can be reached from it directly. Since a path is fully determined by the order in which its vertices are arranged, we represent paths in Haskell by declaring `type Path = [Vertex]`.

For the purposes of this case study, we define BSP as a bi-criteria problem. The first (and leading) criterion is to determine the graph's maximum capacity (or widest path); the second is to determine its shortest path (as a tie breaker for the first criterion).

**Definition 2** (The Bottleneck Shortest Path Problem (BSP)). Given a graph G = (V, E) and source and target vertices s ≠ t in V, we need to determine the shortest maximum-capacity path from s to t, together with its capacity and length. Capacity and distance are given by label functions c : E → C and d : E → D, where C and D are value sets representing capacities and distances, respectively. For practical purposes we typically take C = D = N.

### 2.1 Optimality Criteria

By optimality criteria we mean requirements involving minimality and maximality targets. For example, Dijkstra's and Floyd-Roy-Warshall's pathfinding algorithms seek to minimise path length. Writing ↓ for the minimization operator over a set of values, we can formalise this idea by saying that the shortest path solution (i.e.
length, distance, time or cost) between two nodes is given by applying ↓ to a set of path lengths, where these lengths are obtained by adding together the values (i.e. labels) attached to each edge in the path.

More generally, we need to identify how edge weights will be "added" in order to calculate the overall cost of traversing a path. Having done so, we can then apply the minimization and/or maximization (↑) operators to an appropriate set of "path costs" to find those paths which satisfy the constraint(s) in question.

In this document we focus specifically on distance and capacity constraints, so we assume that edge weights are given as capacity/distance pairs. For example, given source and target vertices s and t in a graph G = (V, E), and a path p ≡ s −(c1, d1)→ ⋯ −(cn, dn)→ t,

1. the distance associated with p is given by ∑ di, and the shortest path is given by applying ↓ to the set of all such sums. Thus, solving the shortest-path problem involves applying the (+) and ↓ operations.
2. the capacity associated with a path is the smallest capacity of its edges. Thus, finding the capacity of a path involves computing ↓ ci. Finding the path with maximum capacity involves applying ↑ to the set of all such path capacities.
3. solving BSP therefore requires computations involving ↑, ↓ and (+) (as well as the projection operators from (ci, di) to ci and di, respectively).

## 3 The non-distributivity problem: an example

Suppose the problem is to find, from vertex 1 to vertex 4 of Figure 2, the maximum capacity among all shortest paths, i.e. the BSP. Assume further that the labels on the edges are natural numbers of type (Capacity, Distance), where type Capacity = Int and type Distance = Int.

Figure 2: Graph exhibiting lack of distributivity for BSP. (Vertices 1–4; the edges 1→2, 2→3 and 3→4 are labelled (1, 1), and the edge 2→4 is labelled (2, 3).)

In this particular case there are two paths from vertex 1 to vertex 4, but only one is optimal.
Let p1 be the path with vertex sequence ⟨1, 2, 3, 4⟩, and p2 the one with sequence ⟨1, 2, 4⟩. Let p denote the optimal path among p1 and p2. Since both paths have the same maximum capacity (namely 1), it is the minimal distance that breaks the tie; thus p1 is optimal, with Cost p1 = (1, 3). Now, in terms of our pathfinding algebra for BSP, we have that

Cost p = (1, 1) join ⟨(1, 1) join (1, 1) choose (2, 3)⟩
       = (1, 1) (↓, +) ⟨(1, 1) (↓, +) (1, 1) (↑, ↓) (2, 3)⟩
       = (1, 1) (↓, +) ⟨(1, 2) (↑, ↓) (2, 3)⟩
       = (1, 1) (↓, +) (2, 3)
       = (1, 4)

and this is clearly not equal to Cost p1 = (1, 3). This shows that join (the multiplication operator for BSP) does not distribute over choose (the addition operator for BSP). The analysis of why this happens is beyond the scope of this paper; the interested reader is referred to [1].

## 4 The solution: a simple data structure to overcome the problem

As we observed in the example of Section 3, some information is left behind in the calculation of the multiplication and addition operations. Specifically, the pair (1, 2), in the inner bracket where choose is located, holds the values of the optimal solution. Since the cost of the path can be computed from a considerably large number of pairs (growing with the size of the graph), we do not know in advance where the optimal pairs are located. Our goal is therefore to preserve all the pairs needed to guarantee the optimal solution: a list of pairs preserving the pathfinding algebras. Let us start by defining which pairs should be stored in the list for BSP.
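Before doing so, the failing pairwise calculation of Sect. 3 can be checked mechanically. The sketch below is our own code, not the paper's: `join'` and `choose'` are hypothetical names for the naive pairwise operators (↓, +) and (↑, ↓) with the lexicographic tie-break, and the edge labels come from Figure 2.

```haskell
type Capacity = Int
type Distance = Int
type Cost     = (Capacity, Distance)

-- Naive pairwise operators: join' = (min, +); choose' keeps the pair
-- with the larger capacity, breaking ties by the smaller distance.
join' :: Cost -> Cost -> Cost
join' (c1, d1) (c2, d2) = (min c1 c2, d1 + d2)

choose' :: Cost -> Cost -> Cost
choose' p1@(c1, d1) p2@(c2, d2)
  | c1 > c2              = p1
  | c1 == c2 && d1 <= d2 = p1
  | otherwise            = p2

-- Edge labels from Figure 2: 1->2, 2->3 and 3->4 are (1,1); 2->4 is (2,3).
costP :: Cost
costP = (1, 1) `join'` (((1, 1) `join'` (1, 1)) `choose'` (2, 3))

costP1 :: Cost  -- true optimum, via the path 1-2-3-4
costP1 = (1, 1) `join'` ((1, 1) `join'` (1, 1))
```

Here `costP` evaluates to (1, 4) while the true optimum `costP1` evaluates to (1, 3), reproducing the mismatch.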
This leads to the following criteria, applicable to both operations join and choose:

- The elements in the list (pairs of type (Capacity, Distance)) should be sorted in descending order of Capacity (because of the maximum operator).
- In case of a tie on Capacity, we break the tie by the second criterion, in this case the minimum operator.
- Pairs with equal Capacity and Distance values are stored only once.

More formally, given pairs of type (Capacity, Distance), denoted (c, d), the result of either BSP operator is stored as [(ci, di), (cj, dj)], where ci > cj ∧ di > dj. Detailed criteria are given in Sect. 5, where lexicographic order is taken into account for breaking ties.

Let us put the above to work on the example in Fig. 2, for which we have Cost p1 = (1, 3) and

Cost p = (1, 1) join ⟨(1, 1) join (1, 1) choose (2, 3)⟩
       = (1, 1) join ⟨[(1, 2)] choose (2, 3)⟩
       = (1, 1) join [(2, 3), (1, 2)]
       = [(1, 3)]

Notice also that our functions join and choose should initially be fed with singletons in order to preserve associativity. The type signatures are therefore:

```haskell
type BSP = [(Capacity, Distance)]

join   :: BSP -> BSP -> BSP
choose :: BSP -> BSP -> BSP
```

## 5 Case study: the BSP problem, breaking ties

In this section we describe the algebra and operators for computing the BSP, followed by their implementation in Haskell. Finally, we show why distributivity is preserved.

### 5.1 Algebra for BSP

We have briefly explained the kind of graph we rely on, although the types or carrier sets can vary. For practical purposes we limit our carrier set to N, which in Haskell can be materialised as Int provided we avoid negative numbers. It is important to highlight that the list of pairs should be kept in order, so that the optimal result so far is simply the head of the list.
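This ordering invariant is easy to state as an executable predicate. The helper below is ours (not part of the paper's code) and anticipates the pairwise relation defined next: each pair must strictly dominate its successor in both capacity and distance.

```haskell
-- BSP-list invariant: successive pairs strictly decrease in both
-- capacity and distance, so the head is always the best pair so far.
validBSP :: [(Int, Int)] -> Bool
validBSP xs = and (zipWith dominates xs (tail xs))
  where
    dominates (c1, d1) (c2, d2) = c1 > c2 && d1 > d2
```

For example, `validBSP [(2,3),(1,2)]` holds (as in the derivation above), while `[(2,3),(1,4)]` violates the invariant because (1, 4) is dominated on capacity but not on distance.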
We can also observe that the potential pairs are those whose capacities and distances have lower values than their predecessors in the list, a relation satisfying p1 ≻ p2 ≻ ⋯ ≻ pn, where ≻ is defined pairwise as

(c1, d1) ≻ (c2, d2) ≡ c1 > c2 ∧ d1 > d2

Furthermore, to make the relation commutative, we can write

(c1, d1) ∼ (c2, d2) ≡ (c1 > c2 ∧ d1 > d2) ∨ (c1 < c2 ∧ d1 < d2)

Now, let us formalize our algebra and operators in order to offer a solution that overcomes non-distributivity in the BSP. Recalling the criteria for BSP, the relation R_BSP is defined as follows:

(c1, d1) R_BSP (c2, d2) ≡ c1 > c2 ∨ (c1 = c2 ∧ d1 ≤ d2)    (1)

In other words, we are looking for the pair having the maximum capacity and the minimum length or distance. The product operator ⊗ is then

(c1, d1) ⊗ (c2, d2) = (c1 ↓ c2, d1 + d2)    (2)

The addition operator ⊕ is defined as

(c1, d1) ⊕ (c2, d2) = (c1 ↑ c2, d)    (3)

where (by the lexicographic ordering)

d = d1        if c1 > c2
d = d1 ↓ d2   if c1 = c2
d = d2        otherwise

We also assume that ⊗ has precedence over ⊕.

### 5.2 Implementation of chBSP, the addition operator

We start by defining the cases (or constraints) for the applicability of the chBSP operator:

**c1** Given two lists of pairs of type (Capacity, Distance), chBSP will store all pairs satisfying Eq. (3). Three cases arise. Given the pairs (c1, d1) and (c2, d2):

1. c1 = c2 (c1.1)
2. c1 > c2 (c1.2)
3. c2 > c1 (c1.3)

Initially, comparing the heads of the lists (or singletons) is quite simple. When the pairs are part of the tails of the lists, more comparisons are needed, since the length must be carried along in order to preserve distributivity.

**c2** For any finite-length input lists (i.e. cds1 and cds2), the function chBSP terminates.

**c3** The length of the resulting list computed by chBSP should be at most the sum of the lengths of the input lists:
length (chBSP cds1 cds2) ≤ length cds1 + length cds2

Let us now define the function chBSP in Haskell:

```haskell
chBSP :: [(Capacity, Distance)] -> [(Capacity, Distance)] -> [(Capacity, Distance)]
chBSP []   cds2 = cds2
chBSP cds1 []   = cds1
chBSP cdcds1@((c1, d1) : cds1) cdcds2@((c2, d2) : cds2)
  | c1 == c2  = (c1, min d1 d2) : chAux (min d1 d2) cds1 cds2
  | c1 > c2   = (c1, d1) : chAux d1 cds1 cdcds2
  | otherwise = (c2, d2) : chAux d2 cdcds1 cds2
  where
    chAux :: Distance -> [(Capacity, Distance)] -> [(Capacity, Distance)]
          -> [(Capacity, Distance)]
    chAux _ [] [] = []
    chAux d [] ((c2, d2) : cds2) = chAux' d c2 d2 [] cds2
    chAux d ((c1, d1) : cds1) [] = chAux' d c1 d1 cds1 []
    chAux d cdcds1@((c1, d1) : cds1) cdcds2@((c2, d2) : cds2)
      | c1 == c2  = chAux' d c1 (min d1 d2) cds1 cds2
      | c1 > c2   = chAux' d c1 d1 cds1 cdcds2
      | otherwise = chAux' d c2 d2 cdcds1 cds2

    chAux' :: Distance -> Capacity -> Distance
           -> [(Capacity, Distance)] -> [(Capacity, Distance)]
           -> [(Capacity, Distance)]
    chAux' d c' d' cds1 cds2
      | d > d'    = (c', d') : chAux d' cds1 cds2
      | otherwise = chAux d cds1 cds2
```

For example, chBSP [(1, 2)] [(2, 3)] yields [(2, 3), (1, 2)], as in the derivation of Sect. 4.

### 5.3 Implementation of jnBSP, the product operator

As with chBSP, there are some requirements for jnBSP to fulfil. Since there is no tie breaker in our equation, unlike in the addition operation, we can apply the operation with two comparisons rather than three, while still looking after the preservation of distributivity. We avoid storing any repeated pair along a traversal, in order to minimise the length of the resulting list.

**c1** Given two lists of pairs of type (Capacity, Distance), jnBSP will store all pairs satisfying distributivity, in the context of a multiplication computation. Two cases arise. Given the pairs (c1, d1) and (c2, d2):

1. c1 ≤ c2 (c1.1)
2. c1 > c2 (c1.2)

As with chBSP, we initially compare the heads of the lists (or singletons). When the pairs are part of the tails of the lists, more comparisons are needed.
**c2** For any finite-length input lists (i.e. cds1 and cds2), the function jnBSP terminates.

**c3** The length of the resulting list computed by jnBSP should be at most the sum of the lengths of the input lists:

length (jnBSP cds1 cds2) ≤ length cds1 + length cds2

The Haskell implementation of jnBSP is intentionally omitted due to the limited space for submission.

### 5.4 jnBSP does distribute over chBSP

Let us consider the following ordering to show that jnBSP does actually distribute over chBSP:

c1 < c2 < c3 and d1 < d2 < d3 in the pairs (c1, d1), (c2, d2), (c3, d3).

Then, applying the operators, we have:

[(c1, d1)] jnBSP ([(c2, d2)] chBSP [(c3, d3)])
  = { applying chBSP, and because c3 > c2 ∧ d2 < d3 }
[(c1, d1)] jnBSP [(c3, d3), (c2, d2)]
  = { applying jnBSP, and since d1 + d2 < d1 + d3 }
[(c1, d1 + d2)]

Now, on the right-hand side,

([(c1, d1)] jnBSP [(c2, d2)]) chBSP ([(c1, d1)] jnBSP [(c3, d3)])
  = { applying the jnBSP's }
[(c1, d1 + d2)] chBSP [(c1, d1 + d3)]
  = { applying chBSP, and d1 + d2 < d1 + d3 }
[(c1, d1 + d2)]

### 5.5 Algorithm for BSP

As stated before, our approach relies on a directed acyclic graph, so the maximum number of edges is v(v − 1)/2, where v is the number of vertices. The trivial case is the one with two vertices and therefore only one edge: a single edge s → t labelled (cst, dst). Assume the capacity set is C = N and the distance set is D = N ∪ {∞}. The corresponding adjacency matrix is

|       | s      | t          |
|-------|--------|------------|
| **s** | (0, ∞) | (cst, dst) |
| **t** | (0, ∞) | (0, ∞)     |

where (0, ∞) is the zero and (∞, 0) is the one (or unit) element of C × D; the zero entry means there is no path between the two vertices.

Another way to view the current solution, for bigger graphs, is through recursion. Given the graph with vertices s, x and t, and edges s → t labelled (cst, dst), s → x labelled (csx, dsx) and x → t labelled (cxt, dxt), we can start our computations from the target vertex t backwards, storing partial solutions.
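Partial solutions are combined with jnBSP and chBSP. Since the Haskell code for jnBSP was omitted above, here is one plausible reconstruction of ours (not the authors' code). It follows requirements c1–c3 and the list invariant of Sect. 4, and leaves the empty-list identity cases ([(0, ∞)] and [(∞, 0)]) to the caller:

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

type Capacity = Int
type Distance = Int
type BSP      = [(Capacity, Distance)]

-- Reconstruction of jnBSP: combine every pair of the two lists with
-- (min, +), sort by descending capacity (ascending distance on ties),
-- then keep only pairs whose distance strictly improves on all
-- higher-capacity pairs already kept.
jnBSP :: BSP -> BSP -> BSP
jnBSP cds1 cds2 = prune sorted
  where
    products = [ (min c1 c2, d1 + d2) | (c1, d1) <- cds1, (c2, d2) <- cds2 ]
    sorted   = sortBy (comparing (Down . fst) <> comparing snd) products
    prune    = go maxBound
      where
        go _ [] = []
        go best ((c, d) : rest)
          | d < best  = (c, d) : go d rest
          | otherwise = go best rest
```

On the running example, `jnBSP [(1,1)] [(2,3),(1,2)]` yields `[(1,3)]`, matching the list-based calculation in Sect. 4.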
Let us write solx, read as "solution at x", for the cost of the best path from x to the target t; here solx = [(cxt, dxt)], and solt = [(∞, 0)]. The solution at s then combines the two alternatives:

chBSP (jnBSP [(cst, dst)] solt) (jnBSP [(csx, dsx)] solx)

Let us depict a final example, a fully connected graph, to describe the complete situation for the single-source approach.

Figure 3: Reusing previous computations. (A complete DAG on vertices 1–5, with an edge i → j labelled (cij, dij) for every i < j.)

The proposed solution for computing the BSP problem from vertex 1 to vertex 5 is:

- sol1 (double arrows) = chBSP of: jnBSP (c15, d15) sol5; jnBSP (c14, d14) sol4; jnBSP (c13, d13) sol3; jnBSP (c12, d12) sol2
- sol2 (single arrows) = chBSP of: jnBSP (c25, d25) sol5; jnBSP (c24, d24) sol4; jnBSP (c23, d23) sol3
- sol3 (dashed arrows) = chBSP of: jnBSP (c35, d35) sol5; jnBSP (c34, d34) sol4

where sol4 (dotted arrow) is the pair (c45, d45), and sol5 (no arrow at all) is the pair (∞, 0), the unit element.

The table above shows that jnBSP is performed as expected, while each chBSP combines a list of joins; we call this function chooses in the following code. Finally, we store the results of each soli in (descending) order to avoid repeated computations. We do so in a one-dimensional array, where the index i of soli represents vertex i. Let us call our single-source function for solving the BSP problem solutionSS.

```haskell
solutionSS :: Int -> Int -> Int -> [(Capacity, Distance)]
solutionSS v e seed = sol ! 1
  where
    sol = array (1, v)
            (  [ (v, oneNonPath) ]
            ++ [ (i, chooses (randomweights !! (i - 1))) | i <- [v-1, v-2 .. 1] ] )
    randomgraph   = elems $ randomGraphFilled v e seed
    randomlist    = zip (randomList 20 seed) (randomList 50 (seed + 1))
    randomweights = mixListsPairs randomgraph randomlist

    chooses :: [(Int, Int, Int)] -> [(Capacity, Distance)]
    chooses []       = zeroNonPath
    chooses (x : xs) = nch (njn [(snd3 x, trd3 x)] (sol ! (fst3 x))) (chooses xs)
```

This code closely follows the definitions in the graphs and tables above: sol is an array holding partial solutions, starting from solt (oneNonPath, i.e. [(∞, 0)], the unit of multiplication). The computation starts at the target vertex, and further solutions are computed backwards. The function chooses computes chBSP's for as long as its input list has pairs of random values; otherwise it returns zeroNonPath, i.e. [(0, ∞)], the unit of addition. The function solutionSS always terminates, since v is a finite number of vertices and the internal list [v−1, v−2, ..., 1] bounds its size. randomweights returns triples, each pairing a vertex index with random values for a Capacity and a Distance. If, after a random graph is generated, there is no path connection from the start vertex s to the target vertex t, then the zero, [(0, ∞)], is returned as the solution by solutionSS.

## 6 Test cases and benchmarking

Throughout this document we have defined functions to solve specific problems in the pathfinding context. Those functions are implemented as purely functional and lazily evaluated. Another way to compute the same results is through eager evaluation. A second concern in this section is the treatment of path tracking, which turns out to make the most noticeable difference in performance between the two evaluation strategies. So far, the jnBSP and chBSP functions cover only the computation of the cost of BSP.

This section compares different implementation approaches. For each case we give the performance, that is, the times, memory allocation, and percentage of garbage collection used. All executions were run on a 2.2 GHz Intel Core i7 processor. Each measurement is the mean of three runs, recording the time taken.
Profiling was used only to determine the performance of the different functions in each program. A separate run was then done without profiling, so that the overheads of profiling would not affect the results.

### 6.1 Eager vs lazy evaluation

One important feature of functional programming languages, specifically the pure ones, is lazy evaluation. This mechanism allows an expression to be evaluated only when it is needed. So far the functions jnBSP and chBSP are lazy; that is, they are not evaluated in advance to verify whether or not a ⊥ is present. However, the difference in time and space consumption between eager and lazy evaluation is minimal for the cost (not the path), since its operators are, by nature, strict, as shown above.

In order to make choose and join strict, we appeal to the library Control.DeepSeq, which provides the function deepseq, rather than the function seq from the native Prelude, since the former traverses data structures deeply. Our solution represents the cost as a pair, so we have

```haskell
mkStrictPair x y = let xy = (x, y) in deepseq xy xy
```

### 6.2 Path tracking

There is a general type signature for both the BSP and knapsack problems,

```haskell
solution :: ((x, y), Path)
```

where (x, y) represents the cost, e.g. (Capacity, Distance). In the end, both problems have the following structure:

```haskell
solution :: ((Int, Int), [Int])
```

Since the path is a list (of integers), the operator ++ (append) is called to build the complete path during the graph traversal. This operator performs more slowly under eager evaluation than under lazy evaluation, mainly because lazy evaluation does not force the appends to be carried out until the very end, so fewer intermediate lists are computed.
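For mkStrictPair to compile it needs an NFData constraint and the corresponding import. A self-contained version, together with a seq-based counterpart for contrast (our sketch, not the paper's code):

```haskell
import Control.DeepSeq (NFData, deepseq)

-- deepseq fully evaluates the pair, forcing both components.
mkStrictPair :: (NFData a, NFData b) => a -> b -> (a, b)
mkStrictPair x y = let xy = (x, y) in deepseq xy xy

-- seq only evaluates to weak head normal form: the pair constructor
-- is forced, but its components may remain unevaluated thunks.
seqPair :: a -> b -> (a, b)
seqPair x y = let xy = (x, y) in xy `seq` xy
```

This is why deepseq, rather than seq, is needed to make the cost computation genuinely strict.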
Now, to make the path tracking strict, we have

```haskell
mkStrictTriple x y z = let xyz = (x, y, z) in deepseq xyz xyz
```

where z is of type

```haskell
type Path = [Int]
```

The corresponding data tables and graphics for the single-source approach follow.

### 6.3 All-pairs approach

We have solved the BSP problem with a dynamic programming approach, but only for cases where the graphs are acyclic. For other types of graph, we appeal to the Floyd-Roy-Warshall algorithm (FRW). This algorithm assumes that the operations over a square matrix modelling the graph are associative, and that multiplication distributes over addition. By applying chBSP and jnBSP, we can now guarantee this property. In this section we describe how to adapt FRW into an algorithm that solves the BSP problem in a functional programming setting, calling this function frw. We define two versions for this purpose: a purely functional one and a stateful (monadic) one. Details of FRW implementations are not the aim of this paper, but those regarding our functions chBSP and jnBSP are. As in the single-source implementation, we show benchmarks for both versions of the all-pairs approach.

We also saw that the closure operator is part of the Warshall-Floyd-Kleene algorithm. For the purposes of the BSP problem we have, for all x ∈ S, where S is the carrier set of this problem,

x* = 0 ↑ x ↑ (x ↓ x) ↑ ⋯ = x

for the MC counterpart, whose elements and operators are (↑, ↓, 0, ∞). On the other hand, we have

x* = ∞ ↓ x ↓ (x + x) ↓ ⋯ = x

for the SP counterpart, whose elements and operators are (↓, +, ∞, 0).

### 6.4 Pure functional version

We start with a general perspective: the function frw takes a graph (in its matrix representation) and returns a graph (also a matrix) with the solutions.
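The shape of this computation can be previewed with plain (min, +) weights standing in for the chBSP/jnBSP list algebra. This is a self-contained sketch of ours, not the paper's code; min and (+) play the roles of chBSP and jnBSP respectively.

```haskell
import Data.Array (Array, Ix, array, bounds, indices, range, (!))

type Vertex = Int
type GraphW = Array (Vertex, Vertex) Int  -- plain weights instead of Pair lists

inf :: Int
inf = 10 ^ (9 :: Int)  -- stand-in weight for "no edge"

mapG :: Ix a => (a -> b -> c) -> Array a b -> Array a c
mapG f a = array (bounds a) [ (i, f i (a ! i)) | i <- indices a ]

-- One FRW induction step for intermediate vertex k.
induct :: Vertex -> GraphW -> GraphW
induct k g = mapG (const . update) g
  where update (x, y) = min (g ! (x, y)) (g ! (x, k) + g ! (k, y))

-- Fold the induction steps over every vertex of the graph.
frw :: GraphW -> GraphW
frw g = foldr induct g (range (lo, hi))
  where ((lo, _), (hi, _)) = bounds g
```

On a three-vertex graph with edges 1→2 and 2→3 of weight 1 and a direct edge 1→3 of weight 5, `frw` improves the (1, 3) entry to 2.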
```haskell
frw :: GraphFRW -> GraphFRW
frw g = foldr induct g (range (limitsG g))
```

where GraphFRW is a type synonym:

```haskell
type Pair     = [(Capacity, Length)]
type GraphFRW = Array Edge Pair
```

From the vast variety of ways to write a solution to FRW, we chose the function foldr. Since foldr, from the Prelude, traverses a data structure given an initial value, combining each element of that data structure with a given function, we simply instantiate the arguments of foldr with:

1. induct, the function to be applied to every element of the data structure;
2. g, a graph, in this case our original adjacency matrix;
3. a list of vertices, obtained from range (limitsG g).

The induct function computes the k-th matrix, or k-th path. That is, for every element in the list of vertices in range (limitsG g), a new matrix is created through mapG, which is called from induct k g.

```haskell
induct :: Vertex -> GraphFRW -> GraphFRW
induct k g = mapG (const . update) g
  where
    update :: Edge -> Pair
    update (x, y) = chBSP (g ! (x, y)) (jnBSP (g ! (x, k)) (g ! (k, y)))

mapG :: Ix a => (a -> b -> c) -> Array a b -> Array a c
mapG f a = array (bounds a) [ (i, f i (a ! i)) | i <- indices a ]
```

### 6.5 Monadic version

This is perhaps a more straightforward implementation of the original FRW algorithm. Given n vertices and a graph xs, the function frw computes FRW through the function runST. It returns, like the purely functional frw, a graph with the solutions (i.e. all pairs). The graph here is again a square matrix, which in this case is mutable. The code is omitted due to the limited space for submission; the function compute is the core of our interest, since it performs the addition and multiplication operations needed to solve the BSP problem.
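A plausible reconstruction of the omitted driver (ours, again with (min, +) weights in place of the chBSP/jnBSP lists so the sketch runs standalone; the paper's compute, shown next, slots into the innermost loop):

```haskell
import Control.Monad (forM_)
import Data.Array (Array, bounds, listArray, (!))
import Data.Array.ST (runSTArray, thaw, readArray, writeArray)

type GraphW = Array (Int, Int) Int

-- In-place FRW over a mutable copy of the matrix; the innermost body
-- is where the paper's 'compute' would be called.
frwM :: GraphW -> GraphW
frwM g = runSTArray $ do
  m <- thaw g
  let ((lo, _), (hi, _)) = bounds g
  forM_ [lo .. hi] $ \k ->
    forM_ [lo .. hi] $ \i ->
      forM_ [lo .. hi] $ \j -> do
        ij <- readArray m (i, j)
        ik <- readArray m (i, k)
        kj <- readArray m (k, j)
        writeArray m (i, j) (min ij (ik + kj))  -- cf. chBSP ij (jnBSP ik kj)
  return m
```

All updates happen on the single mutable array inside runSTArray, which is exactly what gives the monadic version its speed and space advantage over the purely functional one.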
```haskell
compute :: MyA s -> MyA s -> Int -> Int -> Int -> ST s ()
compute xa xb k i j = do
  ij <- readArray xa (i, j)
  ik <- readArray xa (i, k)
  kj <- readArray xa (k, j)
  writeArray xb (i, j) (chBSP ij (jnBSP ik kj))
```

The following chart and tables were calculated without path-tracking, computing just the index (1, v) of the square matrix in order to avoid displaying the whole matrix. Due to the huge amount of memory taken by the random generation of pairs (the labels on the graphs), we profiled specifically the functions frwF, frwFS, frwM, and frwMS, i.e. the pure functional and monadic implementations of the FRW algorithm. We then include the random graph by reading it from a text file. Here is a sample of profiling the single-source approach:

| COST CENTRE | MODULE | % time | % alloc |
|---|---|---|---|
| randomList | Graphs | 51.7 | 51.7 |
| listST.constST | Graphs | 18.5 | 10.8 |
| mapST | Graphs | 14.0 | 17.3 |
| applyST | Graphs | 5.8 | 6.1 |
| randomPerm.swapArray | Graphs | 5.1 | 3.0 |
| randomPerm | Graphs | 3.9 | 10.8 |
| solutionSS | Main | 1.0 | 0.3 |

Table 1: Profiling sspt, single source with path-tracking

Figure 4: Pure functional and monadic implementations for the all-pairs approach

The main difference between the monadic and pure-functional versions is that the former updates the information in place, in the same data structure (i.e. array), whereas the latter generates a new one each time it is needed. This gives the monadic version the advantage, not only in speed but also in the size of the problems it can handle on the same computer. Figure 4 shows that the pure-functional version was able to compute up to 400 vertices, whereas the monadic counterpart handled up to double that, and in much less time.

## 7 Acknowledgements

JCS would like to thank Henrik Nilsson for his invaluable insights during the early stages of this work.

## References

[1] Roland Backhouse (2006): Regular algebra applied to language problems.
The Journal of Logic and Algebraic Programming 66(2), pp. 71–111.

[2] Refael Hassin, Jérôme Monnot & Danny Segev (2010): The complexity of bottleneck labeled graph problems. Algorithmica 58(2), pp. 245–262.

[3] David J. King & John Launchbury (1995): Structuring Depth-first Search Algorithms in Haskell. In: Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '95, ACM, New York, NY, USA, pp. 344–354, doi:10.1145/199448.199530. Available at http://www.researchgate.net/publication/2252048_Structuring_Depth-First_Search_Algorithms_in_Haskell.

[4] Hackage Listing (accessed 17 May 2017): Data.Graph. Available at http://hackage.haskell.org/package/fgl-5.5.3.1/docs/Data-Graph.html. Based on [3].