# F# Data Structures: Set

As I’ve been going through Okasaki’s book on functional data structures, it occurred to me that it’d also be a lot of fun to run through the F# implementation of it’s core data structures and compare them with the book Purely Functional Data Structures.

It’s a lot of fun seeing how fundamental and still powerful the data structures used in Purely Functional Data Structures are and has made reading through this book even more satisfying.

The first significant data structure in Okasaki is the Set, which is implemented using a binary search tree. It’s simple enough to find the F# Core implementation of set in the GitHub repository.

# Purely Functional Data Structures: Chapter 2 - Binary Search Trees

The second section of Chapter 2 deals focuses on binary search trees (BST) and uses BSTs to implement a set data structure. Like the section on lists, this gives a nice little explanation on how to deal with inserts to a tree data structure when the data is immutable in as efficient a way as possible. The most brute force solution to this would be to create a complete duplicate of the tree save for the newly added value, this section shows that, while copying is necessary, only the search path from the root to the new element needs to be copied. The new tree will then reference as much of the old tree as possible; minimizing both copies and memory used.

What this post will cover: how to represent a binary search tree (BST) in F#, how to efficiently update the BST while using immutable data structures, and some discussion on design and style.

# Purely Functional Data Structures: Chapter 2 - Lists

Chapter 2 covers the efficient implementation of stack and binary search trees when your data structures must be immutable. This chapter wound up being a lot of fun to work on because it helped me focus a lot on how to approach structuring my F# code and start laying the foundation for how to approach building elegant F# code.

# Pattern Matching With Pipes

One of my favorite things to do in F# is pipe functions together. I like the elegant flow that the semantics visualizes and the fact that it removes the need for intermediate variables. Given that, one of the minor, but consistent, annoyances I’ve had is when piping a DU type which generally need to be piped into a pattern match expression. This is annoying because the match expression doesn’t lend itself to piping which has meant that my nice workflow needs to be broken up with an intermediate variable that can be used in the match expression.

Looking at some old code I wrote, I just realized something that is probably pretty obvious to every experienced F# programmer out there. The function expression actually does exactly what match does but it creates a function and that’s exactly what I’ve been wanting this whole time!

# Zippers 2: Building a Rose Tree Zipper

As a follow up to my post about Zippers for lists and binary trees, I wanted to create a zipper for a slightly more complex data structure. The Rose Tree is a tree structure where the number of branches a node may have is variable. An example of a rose tree would be the directory structure on your computer: each directory may contain 0 or more sub directories which, in turn, may contain addition subdirectories. With this example in mind, the zipper is analagous to you moving through your computers file system: starting at the root directory and using cd to move down a branch and cd .. to move back.

# F# Zippers

One of the coolest things about working with F# (and other ML languages) is the incredibly elegant way that mathematics intersects with programming, to inform powerful tools for our toolbox. Algebraic Data Types (ADTs) are the source of a large amount of this mathematical invention. Recently, I’ve been exploring the algebra and calculus of types and what happens when you take the derivative of a type.

Zippers are a type pattern which provide a functional way to interact with and transform data structures: linked lists, binary trees, rose trees, etc. The Zipper is a type with a set of functions that create a cursor which moved through a data structure, much like you move through your computers file directory tree, and can be used to modify the data structure. They’re also a really cool demonstration of the intersection between programming and higher math, in this case: derivatives.

In this blog post, I will explain both the mechanics of the Zipper type and the mathematics of the Zipper. First, I’ll explain the list Zipper: how to make one in F# and what can be done with it. Second, the list zipper will be used as the basis for teaching how to take the derivative of an Algebraic Data Type (ADT). Finally, the derivative operation will be usd to create a Zipper for the binary tree.

# Fundamental Theorem of Prog Rock

Given $B$, a band, and $S$, the set of all songs composed by $B$ then $B$ is Prog iff $\exists s \in S$ such that $s$ contains the sound of a baby.

Testing out MathJax and Jekyll.

# A Cartoon Map of Knowledge

While visiting San Diego, for some much needed time off from work, I had a fantastic evening hanging out my friend Steve Bjorg and his family. As the bookend to the evening, Steve and I sat down and had chatted about reading books. It started as merely a random topic on which to build an enjoyable conversation, but quickly morphed into a discussion about how to represent knowledge.

The driver behind our discussion was an exploration of my inability to read technical books on a Kindle or computer. The reason I dislike reading technical books on devices is because they don’t allow me to do my prefered method of navigation. When I’m reading a technical book, there are two possible reasons that are making me read: 1. to solve a specific problem that I am having or 2. to find something which sparks my interest and passion and causes me to explore and experiment. For example, I’ve been learning Clojure, my main resource for that as been one of the O’Reilly books on Clojure. I have the Clojure basics down, so now what I do is flip through the book quickly until I see a diagram, image, or something which catches my attention, at which point I stop and start reading the context. So the question Steve and I began exploring was: can you represent the contents of a large complex technical manual, online, in a way which allows a reader to guide themselves through the entire subject matter in a way which only requires a glance to find what is important.

The problem with doing this sequential flip through of online documentation is that online documenation is not sequential; it is a heirarchical tree or maybe even a directed graph. Each section linking to children which dive in to more detail about each topic covered in the parent section or linking over to related topics. When your book is a choose your own adventure tree, how can you flip through it sequentially while still keeping context? Our first idea was to make this tree a sequence by enumerating it with a breadth first search. Our guess here is that a breadth first search will keep some level of connection between parent and children topics and allow the reader to build an over laying context to keep things reasonable.

My own personal vision of what this would become is something like a cartoon map of an amusement park. Ideally, a wordless, visual guide which represents the broad strokes of what is in the manual while cartooonishly exagerating the features of interest. In other words, and adventurer’s map: pointing out where there be dragons, and trolls, and krakens, and dungeons. So a user can look at it, find something which looks fascinating instantly, and diving into the full granular details and context to learn.

This is all about providing a way for someone who already knows the basics of a topic to keep themselves diving further and further into the ocean. Basically, it’s a carrot to dangle in front of people to motivate them to learn your software by showing them adventures they can have.

# It’s Been 8 Months

It has been quite a long time since my last blog post. Quite a lot has happened in that time. I moved from San Diego to Chicago. I got a new job and now the CTO for a small start up company out in Chicago. I have switched off of Windows development to Linux development. I have gone though doing my work on a Windows box, to a Linux box, and now I’m on a MacBook. I have been doing a lot of learning about Cassandra, Python, Clojure, and Spark. I have been learning about data science and machine learning.

It’s been an exciting and challenging year for me.

I also realized that I miss writing posts (though I only did a few) which help me organize my thoughts and enforce new things which I have learned.

So, now I’ll be kicking that learning off again.

# Type Providers Tutorial Part 4 - Base Types

In Part 3 of this Tutorial, I talked a little about erased types and how the types we generate are actually built on top of a obj. Previously, I just used a plan obj type and cast an integer to and from the obj types.

Using just the basic obj works well enough for a very simple generated type (like the simple integer from part 3). However, it becomes a bit of a mess when you want to make anything complicated.

In this part, we will update our Type Provider to use a more advanced type as our base type.