Functional Design in Clojure

Ep 032: Call Me Lazy

27 min • 7 June 2019

Christoph finds map doesn't let him be lazy enough.

  • Last week, we were dealing with multi-line sprinkle errors.
  • We were able to get more context using partition.
  • (01:33) Problem: the component lines had to be adjacent.
  • Solution last week was to create larger partitions to hopefully get the rest of the error.
  • This became a magic number problem, guessing how far we had to look ahead.
  • "If there's anything I've learned in my career, telling the future is one of the hardest things to do."
  • What number should be big enough? 100? 1000?
  • (04:00) The other problem is that the function is handed a pre-selected set of lines.
  • The decision about how many lines is appropriate is made outside the function.
  • Wouldn't it be nice if the function had control over how far to look ahead?
  • "The function can't function."
  • "Functions are all we've got in functional programming. Well, that and lists."
  • It would be great if the function itself could take a sequence and look as far as it needs to.
  • How about handing the function the entire lazy sequence?
  • (05:52) Problem: Handing in the entire sequence means we can't use map to convert lines into sprinkle errors anymore.
  • We can write a function that gives us just one sprinkle error from the sequence, but we want to convert the sequence into a sequence of all the sprinkle errors.
  • We're going from something that operates on a subset of the sequence to something that operates on the entire sequence, which is too much control.
  • We need a way for it to look ahead but still
  • It's no longer just working on a chunk of the sequence, but on the unbounded sequence itself.
  • We need to elevate it to the same power as other sequence operators, like map and filter.
  • We don't, however, want the function to eagerly find all sprinkle errors in the sequence. It needs to be lazy.
  • (09:26) Solution 1: How can we just get one sprinkle error out?
    1. If the first line isn't the error start, recur with the tail until found.
    2. Do a take-while to find the second half of the error.
    3. When both found, return the value.
  • We need to terminate the search if we hit the end of the sequence, so we only continue if (seq lines) is not nil.
  • "There's no sense in looking in an empty bucket."
  • But we don't want just one, we want the entire sequence.
  • It would be really nice to return the value when we find it and then wait to find the next one until it is requested.
  • Conceptually, we could tell the calling function an index of where to start looking for the next error.
  • (13:44) Solution: In Clojure, we keep our place using the lazy-seq macro.
  • lazy-seq returns a sequence that hasn't been realized yet.
  • It's like being able to hand back a value and a function to call for the next value.
  • When you find a value, you can cons it onto the head of an invocation of lazy-seq to make a new sequence.
  • Step 1. Wrap your entire function body in lazy-seq.
  • This is similar to using delay, because it wraps the code in something that will only be evaluated when it is first accessed.
  • Step 2. Ensure that the body obeys the contract. It must return either:
    • nil, which indicates that the sequence is complete.
    • a sequence, usually constructed by calling cons on a value and a call to lazy-seq.
  • The top of the body is a call to (when (seq lines) ...), which ensures that the sequence terminates when there is no data left.
  • Since the top of our function is lazy-seq, we can cons the found value onto a recursive call to the function.
  • In the recursive call, we must pass the next section of the sequence, so that when evaluated it will pick up at the right place.
  • If we don't find the start of the error, we recurse with the rest of the sequence to try parsing from there.
  • This function will go through the sequence eagerly until it finds something.
  • Instead of operating on single elements in the sequence, we can take a sequence and produce a sequence, powered by lazy-seq.
  • With this capability, you can build a higher level sequence that consumes this sequence and produces a new summary, all done lazily.
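
The progression above, from finding a single error to producing a lazy sequence of them, can be sketched on plain strings. This is only an illustration under assumptions: the line format (a block starts with "ERROR" and continues while lines are indented) and the names start?, continuation?, first-error-block, and error-block-seq are hypothetical and do not come from the episode.

```clojure
(require '[clojure.string :as str])

;; Hypothetical line format: a block starts with a line beginning "ERROR"
;; and continues for as long as the following lines are indented.
(defn start? [line] (str/starts-with? line "ERROR"))
(defn continuation? [line] (str/starts-with? line " "))

(defn first-error-block
  "Eager, one result only: skips ahead with recur until a start line is
  found, then collects its continuation lines with take-while.
  Returns nil if the lines run out first."
  [lines]
  (when (seq lines)
    (if (start? (first lines))
      (into [(first lines)] (take-while continuation? (rest lines)))
      (recur (rest lines)))))

(defn error-block-seq
  "Lazy, all results: the same search wrapped in lazy-seq. The body returns
  nil at the end of the input, or a found block consed onto a recursive
  call that picks up after that block."
  [lines]
  (lazy-seq
    (when (seq lines)
      (if (start? (first lines))
        (let [[more remaining] (split-with continuation? (rest lines))]
          (cons (into [(first lines)] more)
                (error-block-seq remaining)))
        (error-block-seq (rest lines))))))

(def sample-lines
  ["INFO ok"
   "ERROR sprinkle jam"
   "  at sprinkler 7"
   "  at donut 42"
   "INFO ok"
   "ERROR out of sprinkles"])

(first-error-block sample-lines)
;; => ["ERROR sprinkle jam" "  at sprinkler 7" "  at donut 42"]

(error-block-seq sample-lines)
;; => (["ERROR sprinkle jam" "  at sprinkler 7" "  at donut 42"]
;;     ["ERROR out of sprinkles"])

;; Laziness means an unbounded input is fine as long as we only ask for
;; a bounded number of results.
(take 2 (error-block-seq (cycle sample-lines)))
```

Here the function itself decides how far to look ahead (split-with keeps going until the continuation lines stop), so no magic partition size is needed.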

Clojure in this episode:

  • partition
  • seq, cons, rest
  • lazy-seq, delay
  • map, filter, take-while
  • recur
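
To make the comparison between lazy-seq and delay concrete, here is a small sketch. The names evaluations, delayed-value, and lazy-values are made up for illustration; the counter only exists to show when each body actually runs.

```clojure
;; Counts how many times a wrapped body actually runs.
(def evaluations (atom 0))

;; Neither body runs at definition time.
(def delayed-value
  (delay (swap! evaluations inc) :computed))

(def lazy-values
  (lazy-seq (swap! evaluations inc) (cons :first nil)))

@evaluations
;; => 0

(deref delayed-value) ; the delay body runs on first access
(deref delayed-value) ; cached, the body does not run again
(first lazy-values)   ; the lazy-seq body runs on first access
(first lazy-values)   ; cached as well

@evaluations
;; => 2
```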

Code sample from this episode:

(ns devops.week-04
  (:require
    [devops.week-01 :refer [parse-line]]
    [devops.week-02 :refer [process-log]]
    [devops.week-03 :refer [sprinkle-errors-by-type]]))

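(defn sprinkle-error-seq
  "Returns a lazy sequence of sprinkle errors found in lines, a sequence of
  parsed log entries. An error is a \"failed to add sprinkle\" entry
  immediately followed by its \"sprinkle fail reason\" entry."
  [lines]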
  (lazy-seq
    (when (seq lines)
      (let [[first-line second-line & tail] lines
            [_whole donut-id] (some->> first-line :log/message (re-matches #"failed to add sprinkle to donut (\d+)"))
            [_whole error] (some->> second-line :log/message (re-matches #"sprinkle fail reason: (.*)"))]
        (if (and donut-id error)
          (cons (merge first-line
                       {:kind :sprinkle
                        :sprinkle/donut-id donut-id
                        :sprinkle/error error})
                (sprinkle-error-seq tail))
          (sprinkle-error-seq (next lines)))))))


(comment
  (process-log "sample.log" #(->> % (map parse-line) sprinkle-error-seq doall))
  (process-log "sample.log" #(->> % (map parse-line) sprinkle-error-seq sprinkle-errors-by-type))
  )
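
The sprinkle-errors-by-type function used in the comment block comes from an earlier episode and is not shown here. As a rough idea of what a summary over the error sequence could look like, here is a hypothetical stand-in, count-errors-by-reason, run on made-up maps shaped like the output of sprinkle-error-seq. It is a sketch, not the week-03 implementation.

```clojure
;; Made-up data shaped like the maps sprinkle-error-seq produces.
(def sample-errors
  [{:kind :sprinkle :sprinkle/donut-id "1" :sprinkle/error "out of sprinkles"}
   {:kind :sprinkle :sprinkle/donut-id "2" :sprinkle/error "nozzle clogged"}
   {:kind :sprinkle :sprinkle/donut-id "3" :sprinkle/error "out of sprinkles"}])

;; Hypothetical stand-in for a by-type summary: counts errors per reason.
(defn count-errors-by-reason
  [errors]
  (frequencies (map :sprinkle/error errors)))

(count-errors-by-reason sample-errors)
;; => {"out of sprinkles" 2, "nozzle clogged" 1}
```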