Funflow is a workflow management tool. It turns out that workflow management tools and build tools are closely related. So if you’re more familiar with the latter, this post might be of interest to you. We’ll be illustrating some of Funflow’s features by implementing Make on top.
All the code for this example as well as some documentation can be found here.
Today’s example: Make
Our example is a simple version of GNU’s Make restricted to building C files (though it could be generalized). It takes a makefile that looks like this
source-files: main.cpp factorial.cpp hello.cpp functions.h hello: main.o factorial.o hello.o functions.h g++ main.o factorial.o hello.o -o hello main.o: main.cpp functions.h g++ -c main.cpp factorial.o: factorial.cpp functions.h g++ -c factorial.cpp hello.o: hello.cpp functions.h g++ -c hello.cpp
and can build the
$ ls factorial.cpp functions.h hello.cpp main.cpp makefile $ stack exec makefile-tool $ ./hello Hello World! The factorial of 5 is 120
Because we used Funflow, our tool has several desirable properties, both of the tool and of the code itself:
- No repeated builds: If we’ve built something before, we don’t build it again. Period.
make, if you make a tiny change and find out it isn’t what you want, when you revert back
makewill rebuild something it had built before. Our tool won’t.
- Enforced target dependencies: We build each target file in a docker container
with exactly the dependencies listed in the
Makefile. There’s no opportunity for hidden dependencies or hidden requirements on the environment. Further, such ‘hidden’ preconditions are caught early and flagged making it easy to fix the
- Clean Sequencing Code: The function that makes a target file sequences
file processing, recursive calls for making the dependencies of the given target,
and running docker containers. Usually, this sequencing is messy
and difficult to follow.
With Funflow, however, we can inject these various forms of
Flows and sequence them seamlessly with arrow notation.
- Concise Code: Because of the abstractions Funflow provides, we can focus on the essential algorithm and get some efficiency & safety for free.
No Repeat Builds: CAS Gives Us Free Dynamic Programming
The essential function is
buildTarget which, given a
and a specific rule in that
Makefile, creates a flow to build the
target specified by that rule. A rule specifies a target file by
a list of file dependencies and a command to build that target file.
Each dependency is either a source file or itself a target file.
Here is a slightly simplified version of that function:
-- For readability, we introduce a type alias for Flow type a ==> b = Flow a b buildTarget :: MakeFile -> MakeRule -> (() ==> (Content File)) buildTarget mkfile target@(MakeRule targetNm deps cmd) = proc _ -> do -- 1) Get source file contents contentSrcFiles <- grabSrcsFlow -< neededSources -- 2) Zip these with the names of each source file let fullSrcFiles = Map.fromList $ zip neededSources contentSrcFiles -- 3) Recursively build all dependent targets depFiles <- flowJoin depTargetFlows -< (replicate countDepFlows ()) -- 4) Compile this file given -- a) the name of the target -- b) the dependencies that are source files -- c) the dependencies that were targets -- d) the gcc command to execute compiledFile <- compileFile -< (targetNm, fullSrcFiles, depFiles,cmd) returnA -< compiledFile where srcfiles = sourceFiles mkfile neededTargets = Set.toList $ deps `Set.difference` srcfiles neededSources = Set.toList $ deps `Set.intersection` srcfiles depRules = (findRules mkfile neededTargets :: [MakeRule]) depTargetFlows = map (buildTarget mkfile) depRules countDepFlows = length depTargetFlows grabSources srcs = traverse (readFile . ("./" ++)) srcs grabSrcsFlow = stepIO grabSources
The code for building the flow is straightforward.
First, we do some file processing to get the dependent source files
(in steps 1 & 2).
Then, we recursively call
buildTarget to create flows for each of the
dependent target files and then run those flows to build the
dependent targets (in step 3). Finally, we put these dependencies
in a docker container in which we run the gcc command and extract the
produced target file (provided the compilation succeeded).
On the surface, this code appears inefficient.
Step 3, the recursive building of dependent target files, looks like it repeats
a lot of work since many of the recursive calls are identical.
For example, if the rule for
main.o in our sample
factorial.o as a dependency, it looks like this code
factorial.o twice -once as a dependency of
hello and once as
a dependency of
factorial.o took a long time to build,
this would be disastrous.
Yet, this code isn’t inefficient? Why?
This problem of ‘overlapping’ recursive calls is solved by dynamic programming,
the algorithmic design pattern of saving the values of function calls in a dictionary and
checking this dictionary to avoid repeated recursive calls.
Funflow’s caching gives this to us for free. Hence, we can write
DP-algorithms without dealing with the added complexity
of keeping track of the dictionary ourselves. This is precisely what happens in
Each time a file is compiled with
a hash of the inputs and output is cached. So, on a repeated
recursive call building say,
target4.o, the use of
is constant time.
However, this goes a step further.
Because of Funflow’s caching our tool, unlike GNU’s
make, doesn’t re-build
targets even after reverting back changes.
factorial.cpp took a long time to build and was part of
a larger project. Suppose further that to fix a bug in this large project,
we tried something out in
factorial.cpp and found out it didn’t work.
When we revert back, our tool would build
factorial.cpp in constant time
whereas plain ol’
make would not.