In episode 15, Pete Welcher and Chris Kane join us to talk about what exactly characterizes a well run network. Is it great documentation? Is it consistent application of best practices? Maybe it’s process and procedure? Join our guests, and the decades of experience they bring, as they sit around the virtual roundtable to share their thoughts on the topic.
Show Notes
Design
- Keep it simple. Just because you can, doesn’t mean you should
- Complexity is not just a networking attribute, it’s an overall system attribute
- Proper design leads to simplicity (most of the time)
- What technologies are simple? How do you recognize complexity? Is vendor lock-in one indicator? Network management software / API lock-in another growing one?
- There are some pretty simple campus/user, datacenter, Internet Edge, and WAN approaches Big organizations can handle and may need a bit more complexity — or not
- Modularity- too many moving parts that have to work together = complex
Operations
- Transparent Network. It just works
- Able to easily implement changes
- Agnostic to both today’s needs and flexible to absorb tomorrow’s needs
- Up-to-date diagrams and documentation matter
- Organized around OSI layers
- Documented naming conventions with fixed fields
- MTTR e.g. from NetMRI and some other tools
- Up-to-date code
- Common code across any given router or switch model Measurement
- There is no way to tell how the network is performing currently as compared to previous performance unless data is collected at regular intervals
- Ideally is done via more sophisticated tooling that not only gathers the data but also performs benchmarks and reports on anomalies
- Network awareness among engineers
- Interpersonal communication and team collaboration
Lab Testing
- Used to be sacrificed as a CAPEX or OPEX that was cost prohibitive because of the vast resources needed to provide test points outside, on the edges, of the network
- All network designers and operators should expect/select OEMs that provide a virtual edition of their hardware
- These software-only solutions should be used to re-create at least a subset if not all of the infrastructure. Then this virtual environment can be used to rehearse upcoming changes and their possible effects/outcomes.
- This should also be used to verify that backups are not only occurring but are actually useful. Much like an application or database backup should be regularly tested, so should the network backups.
Change Management
- Wear a black concert T-shirt during all Maintenance Windows
- Listen to Grunge music as IP truly came of money-making age during that era and the music soothes the network.
- If in a datacenter, noise-cancelling headphones and conf call helps. Sep call/webex with 30 minute updates for execs (prevents constant second-guessing and distraction), and liaison on both the tech and exec call, jotting notes for periodic exec update
- Have an Implementation Plan
- Tabs in XLS, including contact info, pre-change configs, configlets to blow in, test plan, backout configlets. All SSH connections opened prior to change window.
- Peer Review and Sign Off of Implementation Plan
- Take Pre-Change Snapshots/Health Checks
- Take Post-Change Snapshots/Health Checks
- Insist upon user/app owner User Acceptance Testing to occur prior to and after completion of the change.
- Use lab network
- When selecting between OEM solutions you absolutely must conduct a Proof-of-Concept as reading of data sheets