Functional Design in Clojure

Ep 020: Data Dessert

29 min • 15 March 2019

Christoph and Nate discuss the flavor of pure data.

  • "The reduction of the good stuff."
  • "We filter the points and reduce the good ones."
  • Concept 1: To use the power of Clojure core, you give it functions as the "vocabulary" to describe your data.
    • "predicate" function: produces truth values about your data
    • "view" or "extractor" function: returns a subset or calculated value from your data
    • "mapper" function: transforms your data into different data
    • "reduction" (or "reducer") function: combines your data together
  • Concept 2: Don't ignore the linguistic aspect of how you name your functions.
    • Reading the code can describe what it is doing.
    • Good naming is for humans. Clojure doesn't care.
  • Concept 3: Transform the data source into a big "bag" of data that is true to the structure and information of the source.
    • Source data describes the source information well and is not concerned with how it will be processed.
    • Transform into data that is useful for processing.
  • Concept 4: Using loop + recur for data transform is a code smell.
    • Not composable: encourages shoving everything together in one place.
    • "End up with a ball of mud instead of a bag of data you can sift through."
    • "You know what mud sticks to really well? More mud! It's very cohesive! And what couldn't be better than cohesive programs!"
  • Concept 5: Use loop + recur for recursion or blocking operations (like core.async)
    • Data shows up asynchronously
    • Useful when logic is more naturally expressed as recursion than filter + map + reduce.
  • Concept 6: Duality: stepwise vs aggregate
    • Stepwise problem: advance a game state, apply async event, stream processing, etc.
    • Stepwise: reduce, loop + recur
    • Aggregate problem: selecting the right data and combining it together.
    • Aggregate: filter + map + reduce
    • Aggregate problems tend to be eager--they want to process the whole data set.
  • Concept 7: Use your bag of granular data to work toward a bag of higher-level data.
    • We went from lines → entries → days → weeks
    • "Each level of data allows you to answer different questions."
  • Concept 8: Duality: higher-level data vs granular data with lots of dimensions
    • E.g., having a single "day" record vs a bunch of "entry" records that all share the same "date" field.
    • The "right" choice depends on your usage pattern.
    • Dimensional data tends to stay flat, but high-level data tends toward nesting.
    • A high-level record is a pre-calculated answer you can use over and over quickly.
    • A highly dimensional, granular record allows you to "ask" questions spanning arbitrary dimensions. E.g., "What weeknights in January did I work past midnight?"
  • Concept 9: Keep it pure. Avoid side effects as much as possible.
    • Pure functions are the bedrock of functional programming.
    • REPL and unit test friendly.
    • "You can use data without hidden attachments. You remember side effects when you're writing them, but you don't remember them three months later."
  • Concept 10: Keep I/O at the "edges" with pure functions in the "middle".
    • "I/O should be performed by functions that you didn't write."
    • Use pure functions to format your data so you only have to hand it off to the I/O function. E.g., create a list of "line" strings to emit with (run! println lines).
    • You can describe your I/O operations in data and make a "boring" function that just follows them. This allows you to unit test the complicated logic that determines the operations.
    • Separates I/O-specific problems (e.g., retries, I/O exceptions) from business logic problems.
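
As a rough sketch of how the "vocabulary" functions from Concept 1 and the pure-middle, I/O-edge split from Concepts 9 and 10 might fit together, the example below assumes a hypothetical time-tracking entry with :date, :task, :start, :end (minutes since midnight), and :billable? keys. The data shape and every function name here are illustrative assumptions, not code from the episode.

```clojure
;; Hypothetical "bag" of granular entries; all keys are assumptions for illustration.
(def entries
  [{:date "2019-01-07" :task "design" :start 540 :end 630 :billable? true}
   {:date "2019-01-07" :task "lunch"  :start 720 :end 765 :billable? false}
   {:date "2019-01-08" :task "coding" :start 540 :end 660 :billable? true}])

;; Predicate: produces a truth value about one entry.
(defn billable? [entry]
  (boolean (:billable? entry)))

;; Extractor ("view"): returns a calculated value from one entry.
(defn duration [entry]
  (- (:end entry) (:start entry)))

;; Reducer: combines values together.
(defn total [xs]
  (reduce + 0 xs))

;; Aggregate question that reads like a sentence:
;; "total the durations of the billable entries."
(defn billable-minutes [entries]
  (->> entries
       (filter billable?)
       (map duration)
       total))

;; Mapper: transforms an entry into different data (here, a report line).
(defn entry->line [entry]
  (str (:date entry) " " (:task entry) " " (duration entry) " min"))

;; Pure "middle": builds the lines to emit without performing any I/O.
(defn report-lines [entries]
  (concat (map entry->line (filter billable? entries))
          [(str "total " (billable-minutes entries) " min")]))

;; I/O "edge": hand the finished lines to functions you didn't write.
(defn print-report! [entries]
  (run! println (report-lines entries)))
```

Because report-lines is pure, it can be checked at the REPL or in a unit test without capturing output; only print-report! touches the outside world.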

Clojure in this episode:

  • filter, map, reduce
  • loop, recur
  • group-by
  • run!
  • println
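
To make the functions listed above a little more concrete, here is a minimal sketch of the two dualities from Concepts 6 through 8: an aggregate step that rolls granular entries up into higher-level "day" records with group-by, and a stepwise problem written once with reduce and once with loop + recur. The entry and event shapes, and all function names, are assumptions made for illustration rather than code from the episode.

```clojure
;; Hypothetical granular entries; the keys are assumptions for illustration.
(def logged-entries
  [{:date "2019-01-07" :minutes 90}
   {:date "2019-01-07" :minutes 45}
   {:date "2019-01-08" :minutes 120}])

;; Aggregate: entries -> days. Each "day" record is a pre-calculated answer
;; that can be reused without rescanning the granular entries.
(defn entries->days [entries]
  (->> (group-by :date entries)
       (map (fn [[date day-entries]]
              {:date    date
               :entries (count day-entries)
               :minutes (reduce + (map :minutes day-entries))}))
       (sort-by :date)))

;; Stepwise: a pure function that advances a state by one event.
(defn apply-event [state event]
  (case (:type event)
    :clock-in  (assoc state :working? true)
    :clock-out (assoc state :working? false)
    state)) ; unknown events leave the state unchanged

;; Stepwise with reduce: fold every event into the starting state.
(defn run-events [events]
  (reduce apply-event {:working? false} events))

;; The same stepwise logic with loop + recur, shown only for comparison.
(defn run-events-loop [events]
  (loop [state     {:working? false}
         remaining events]
    (if (empty? remaining)
      state
      (recur (apply-event state (first remaining))
             (rest remaining)))))
```

When all events are already in hand, the reduce version is the shorter of the two; the loop + recur form is more plausibly worth its extra ceremony when each step has to wait for the next event to arrive.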