We talked about:
- Nadia’s background
- Academic research in software engineering
- Design patterns
- Software engineering for ML systems
- Problems that people in industry have with software engineering and ML
- Communication issues and setting requirements
- Artifact research in open source products
- Product vs model
- Nadia’s open source product dataset
- Failure points in machine learning projects
- Finding solutions to issues using Nadia’s dataset and experience
- The problem of siloing data scientists and other structure issues
- The importance of documentation and checklists
- Responsible AI
- How data scientists and software engineers can work in an Agile way

Links:
- Model Cards: https://arxiv.org/abs/1810.03993
- Datasheets for Datasets: https://arxiv.org/abs/1803.09010
- FactSheets: https://arxiv.org/abs/1808.07261
- Research Paper: https://www.cs.cmu.edu/~ckaestne/pdf/icse22_seai.pdf
- arXiv version: https://arxiv.org/pdf/2110.
Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html