Functional Design in Clojure

Ep 035: Lifted Learnings

27 min • 28 June 2019

Christoph and Nate lift concepts from the raw log-parsing series.

  • Reflecting on the lessons learned in the log series.
  • (01:15) Concept 1: We found Clojure to be useful for devops.
    • Everything is a web application these days.
    • "The only UIs in Devops are dashboards."
    • For most of the series, our UI was our connected editor.
    • We grabbed a chunk of the log file and were fiddling with the data in short order.
    • We talk about connected editors in our REPL series, starting with Episode 12.
    • Being able to iteratively work on the log parsing functions in our editor was key to exploring the data in the log files.
  • (04:04) Concept 2: Taking a lazy approach is essential when working with a large data set.
    • Lazily going through a sequence is reminiscent of database cursors. You are at some point in a stream of data.
    • We ran into some initial downsides.
    • When using with-open, fully lazy processing results in an I/O error, because the file is already closed by the time the sequence is realized.
    • Don't be too eager too early, or the entire dataset will end up in memory.
    • Two kinds of functions: lazy and eager.
      • Lazy functions only take from a sequence as they need more values.
      • Eager functions consume the whole sequence before returning.
    • Ensure that only the last function in the processing chain is eager.
    • "It only takes one eager to get everybody unlazy."
  • (08:38) Concept 3: Clojure helps you make your own lazy sequences using lazy-seq.
    • Clojure has a deep library of functions for making and processing lazy sequences.
    • We were able to make our own lazy sequences that could then be used with those functions.
    • Wrap the function body in lazy-seq and return either nil (to indicate the end) or a sequence created by calling cons on a real value and a recursive call to the function itself.
  • (12:41) Concept 4: We work with information at different levels, and that forms an information hierarchy.
    • The data goes from bits to characters to lines, and then we get involved.
    • We move from lines on up to more meaningful entities. Parsed lines are maps that have richer information, and then errors are richer still.
    • Our parsers take a sequence and emit a new sequence that is at a higher level of information.
    • We first explored this concept in the Time series.
    • The transformations from one level to the next are all pure.
  • (14:53) Concept 5: Sometimes you have to go down before you can go up again another way.
    • We pre-abstracted a little bit, and only accepted lines that had all of the data we were looking for (time, log level, etc.).
    • Exceptions broke that abstraction, so we reworked our "parsed line" map to make the missing keys optional.
  • (15:54) Concept 6: Maps are flexible bags of dimensions. They are a set of attributes rather than a series of rigid slots that must be filled.
    • Functions only need to look at the parts of the map that they need.
    • Every time we amplify the data, we add a new set of dimensions.
    • Thanks to namespacing, all of these dimensions coexist peacefully.
    • Multiple levels of dimensions give you more to filter/map/reduce on.
    • Just because you distill doesn't mean you want to lose the essence.
  • (21:09) Concept 7: Operating within a level of information is a different concern than lifting up to a higher level of information.
    • Within a level, functions aid in filtering and aggregating.
    • Between levels, functions recognize patterns and groupings to produce higher levels of information.
    • Make the purpose of the function clear in how you name it.
    • Separate functions that "lift" the data from functions that operate at the same level of information.
    • When exploring data, you don't know where it will lead, so start by moving the data up a level in small steps.
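
As a rough illustration of Concepts 4 through 7, here is a minimal sketch of one "lifting" function and one same-level function. The log format, the regular expression, and the `:log/...` key names are assumptions made up for this example; they are not taken from the episode's actual code.

```clojure
;; Hypothetical log format, assumed for illustration only:
;;   "2019-06-28 12:00:02 ERROR Something broke"
(def line-pattern
  #"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")

(defn parse-line
  "Lifts one raw line into a map with namespaced keys. Lines that do not
  match the pattern (for example stack-trace lines) still produce a map,
  just with fewer keys, so the time/level/message keys are optional."
  [line]
  (if-let [[_ time level message] (re-matches line-pattern line)]
    {:log/raw line :log/time time :log/level level :log/message message}
    {:log/raw line}))

(defn lines->entries
  "Lift: a sequence of lines becomes a lazy sequence of parsed-line maps."
  [lines]
  (map parse-line lines))

(defn errors
  "Same level: keeps only the entries whose level is ERROR. It looks at
  one key and ignores the rest of the map."
  [entries]
  (filter #(= "ERROR" (:log/level %)) entries))

(def sample-lines
  ["2019-06-28 12:00:01 INFO Started"
   "2019-06-28 12:00:02 ERROR Something broke"
   "    at example.Main.run(Main.java:10)"])

(def sample-errors
  (errors (lines->entries sample-lines)))
```

The names try to signal intent: `lines->entries` moves the data up a level, while `errors` stays within one. Because every entry keeps `:log/raw`, distilling the data does not throw away the original line.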

Clojure in this episode:

  • lazy-seq, cons
  • with-open
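
The two ideas listed above can be sketched together. This is a minimal example under assumptions: `countdown` and `count-matching-lines` are hypothetical names invented here, not functions from the episode. The first shows the lazy-seq and cons recipe from Concept 3; the second keeps every step lazy except the last one, which runs inside with-open so the file is still open while it is being read.

```clojure
(require '[clojure.java.io :as io])

(defn countdown
  "Custom lazy sequence: n, n-1, ..., 1.
  The body is wrapped in lazy-seq; `when` returns nil to signal the end,
  otherwise we cons a real value onto a recursive call."
  [n]
  (lazy-seq
    (when (pos? n)
      (cons n (countdown (dec n))))))

(defn count-matching-lines
  "Counts the lines in the file at `path` that satisfy `pred`.
  line-seq and filter are lazy; count is the single eager step, and it
  runs inside with-open, before the reader is closed."
  [path pred]
  (with-open [rdr (io/reader path)]
    (count (filter pred (line-seq rdr)))))

(countdown 3)                 ;; => (3 2 1)
(take 2 (countdown 1000000))  ;; => (1000000 999999), the rest is never computed
```

Returning `(filter pred (line-seq rdr))` directly from the with-open body, without an eager step, would hand the caller a lazy sequence whose reader is already closed, which is the I/O error described in Concept 2.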