
Ep 032: Call Me Lazy

27 min • 7 June 2019

Christoph finds map doesn't let him be lazy enough.

Last week, we were dealing with multi-line sprinkle errors.
We were able to get more context using partition.
(01:33) Problem: the component lines had to be adjacent.
Last week's solution was to create larger partitions in the hope of capturing the rest of the error.
This became a magic number problem, guessing how far we had to look ahead.
"If there's anything I've learned in my career, telling the future is one of the hardest things to do."
What number should be big enough? 100? 1000?
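To make the magic-number problem concrete, here is a minimal sketch. The log lines, the helper names (windows, complete-error?), and the use of a sliding step of 1 are assumptions made for illustration; they are not taken from last week's code.

```clojure
;; Hypothetical log lines: the two halves of the error are NOT adjacent.
(def lines
  ["failed to add sprinkle to donut 42"
   "unrelated message"
   "sprinkle fail reason: out of sprinkles"])

(defn windows
  "Sliding windows of n lines, one window starting at each line."
  [n lines]
  (partition n 1 lines))

(defn complete-error?
  "True when a window starts with the error and also contains the reason."
  [window]
  (boolean
    (and (re-find #"^failed to add sprinkle" (first window))
         (some #(re-find #"^sprinkle fail reason" %) (rest window)))))

;; A window of 2 is too small to see the reason line...
(some complete-error? (windows 2 lines)) ; => nil
;; ...so the window size has to be guessed upward.
(some complete-error? (windows 3 lines)) ; => true
```

Whatever number is chosen, some log will eventually put one more unrelated line between the two halves.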
(04:00) The other problem is that the function is handed a pre-selected set of lines.
The decision about how many lines is appropriate is made outside the function.
Wouldn't it be nice if the function had control over how far to look ahead?
"The function can't function."
"Functions are all we've got in functional programming. Well, that and lists."
It would be great if the function itself could take a sequence and look as far as it needs to.
How about handing the function the entire lazy sequence?
(05:52) Problem: Handing in the entire sequence means we can't use map to convert lines into sprinkle errors anymore.
We can write a function that gives us just one sprinkle error from the sequence, but we want to convert the sequence into a sequence of all the sprinkle errors.
We're going from something that operates on a subset of the sequence to something that operates on the entire sequence, which is too much control.
We need a way for it to look ahead but still
It's no longer just working on a chunk of the sequence, but on the unbounded sequence itself.
We need to elevate it to the same power as other sequence operators, like map and filter.
We don't, however, want the function to eagerly find all sprinkle errors in the sequence. It needs to be lazy.
(09:26) Solution 1: How can we just get one sprinkle error out?
1. If the first line isn't the error start, recur with the tail until found.
2. Do a take-while to find the second half of the error.
3. When both found, return the value.
We need to terminate the search if we hit the end of the sequence, so we only continue if (seq lines) is not nil.
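The search for a single error could be sketched roughly as follows. This is an illustration under assumptions, not the episode's code: it works on raw strings instead of parsed log maps, the name first-sprinkle-error and the result keys are hypothetical, and it uses some to scan ahead for the reason line where the episode mentions take-while.

```clojure
(defn first-sprinkle-error
  "Returns the first sprinkle error found in lines, or nil if there is none."
  [lines]
  (loop [lines lines]
    ;; Only keep looking while there is something left to look at.
    (when (seq lines)
      (let [[line & tail] lines]
        (if-let [[_ donut-id] (re-matches #"failed to add sprinkle to donut (\d+)" line)]
          ;; Found the start: look as far ahead as needed for the reason line.
          (let [[_ reason] (some #(re-matches #"sprinkle fail reason: (.*)" %) tail)]
            (when reason
              {:donut-id donut-id :reason reason}))
          ;; Not the start of an error: try again with the tail.
          (recur tail))))))

(first-sprinkle-error
  ["ok"
   "failed to add sprinkle to donut 7"
   "noise"
   "sprinkle fail reason: jammed"])
;; => {:donut-id "7", :reason "jammed"}
```

The (when (seq lines) ...) guard is what makes the loop stop and return nil once the input runs out.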
"There's no sense in looking in an empty bucket."
But we don't want just one, we want the entire sequence.
It would be really nice to return the value when we find it and then wait to find the next one until it is requested.
Conceptually, we could tell the calling function an index of where to start looking for the next error.
(13:44) Solution: In Clojure, we keep our place using the lazy-seq macro.
lazy-seq produces a sequence whose body has not been evaluated yet; it is realized only when first accessed.
It's like being able to hand back a value and a function to call for the next value.
When you find a value, you can cons it onto the head of an invocation of lazy-seq to make a new sequence.
Step 1. Wrap your entire function body in lazy-seq.
This is similar to using delay, because it wraps the code in something that will only be evaluated when it is first accessed.
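A small experiment can show the similarity. The atom and the var names here are made up for the demonstration; the point is only that neither body runs until the value is first accessed, and that each body runs at most once.

```clojure
;; Counts how many times a wrapped body has actually run.
(def evaluations (atom 0))

;; Creating these does not run either body.
(def delayed (delay (swap! evaluations inc) :value))
(def lazy-items (lazy-seq (swap! evaluations inc) (list :a :b)))

(def before-access @evaluations)                  ; 0, nothing has run yet

;; First access forces each body exactly once.
(def first-access [@delayed (first lazy-items)])  ; [:value :a]
;; Later accesses reuse the cached result.
(def second-access [@delayed (first lazy-items)])

(def after-access @evaluations)                   ; 2, one run per body
```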
Step 2. Ensure that the body obeys the contract. It must return either:
- nil, which indicates that the sequence is complete.
- a sequence, usually constructed by calling cons on a value and a call to lazy-seq.
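Both halves of the contract fit in a tiny function. The countdown function below is a hypothetical example chosen only because it is short; it is not from the episode.

```clojure
(defn countdown
  "Lazy sequence n, n-1, ..., 1."
  [n]
  (lazy-seq
    ;; Returning nil (when the test fails) marks the end of the sequence.
    (when (pos? n)
      ;; Otherwise return a value consed onto a recursive, still-lazy call.
      (cons n (countdown (dec n))))))

(countdown 3)                  ; => (3 2 1)
(countdown 0)                  ; => ()
(take 2 (countdown 1000000))   ; => (1000000 999999)
```

Because the recursive call is itself wrapped in lazy-seq, it does not run until someone asks for the next element, so taking two items from a very long countdown does only two steps of work.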
The top of the body is a call to (when (seq lines) ...), which ensures that the sequence terminates when there is no data left.
Since the top of our function is lazy-seq, we can cons the found value onto a recursive call to the function.
In the recursive call, we must pass the next section of the sequence, so that when evaluated it will pick up at the right place.
If we don't find the start of the error, we recurse with the rest of the sequence to try parsing from there.
This function will go through the sequence eagerly until it finds something.
Instead of operating on single elements in the sequence, we can take a sequence and produce a sequence, powered by lazy-seq.
With this capability, you can build a higher level sequence that consumes this sequence and produces a new summary, all done lazily.
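As a sketch of what that laziness buys, the following uses a simplified stand-in for the episode's function. The names error-seq, inspected, and endless-log are assumptions for illustration, and the input is an endless cycle of made-up lines.

```clojure
;; Counts how many input lines have been looked at so far.
(def inspected (atom 0))

(defn error-seq
  "Lazy sequence of the lines that start with \"ERROR \"."
  [lines]
  (lazy-seq
    (when (seq lines)
      (swap! inspected inc)
      (let [[line & tail] lines]
        (if (re-find #"^ERROR " line)
          (cons line (error-seq tail))
          (error-seq tail))))))

;; An input that never ends.
(def endless-log
  (cycle ["ok" "ERROR disk full" "ok" "ERROR timeout"]))

;; A higher-level pipeline built on top of the lazy sequence.
(def first-three
  (->> endless-log
       error-seq
       (map #(subs % 6))   ; drop the "ERROR " prefix
       (take 3)
       vec))
;; => ["disk full" "timeout" "disk full"]
```

Even though the input is unbounded, only a handful of lines are inspected, because each stage asks the previous one for no more than it needs.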

Clojure in this episode:

partition
seq, cons, rest
lazy-seq, delay
map, filter, take-while
recur

Code sample from this episode:

(ns devops.week-04
  (:require
    [devops.week-01 :refer [parse-line]]
    [devops.week-02 :refer [process-log]]
    [devops.week-03 :refer [sprinkle-errors-by-type]]))

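(defn sprinkle-error-seq
  "Returns a lazy sequence of sprinkle errors found in the parsed log lines."
  [lines]
  (lazy-seq
    ;; Stop (return nil) once the input is exhausted.
    (when (seq lines)
      (let [[first-line second-line & tail] lines
            [_whole donut-id] (some->> first-line :log/message (re-matches #"failed to add sprinkle to donut (\d+)"))
            [_whole error] (some->> second-line :log/message (re-matches #"sprinkle fail reason: (.*)"))]
        (if (and donut-id error)
          ;; Found both halves: emit the error, then lazily continue after them.
          (cons (merge first-line
                       {:kind :sprinkle
                        :sprinkle/donut-id donut-id
                        :sprinkle/error error})
                (sprinkle-error-seq tail))
          ;; No match here: try again starting from the next line.
          (sprinkle-error-seq (next lines)))))))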


(comment
  (process-log "sample.log" #(->> % (map parse-line) sprinkle-error-seq doall))
  (process-log "sample.log" #(->> % (map parse-line) sprinkle-error-seq sprinkle-errors-by-type))
  )
