Code Complete, 2nd Edition, by Steve McConnell

Imagine a single book that tries to compile the best practices of software construction accumulated over the years. With about 450 entries in its bibliography covering the most important publications of the time (the second edition was published in 2004), this book endeavours to bring the development process to a new level. Essentially, it interprets the most important of those publications and gives guidelines for building software. It takes the recommendations of gurus and researchers in the field and presents practical rules for effectively building quality software, thus narrowing the gap between research and production. Hence the title, which implies that the book is a complete set of laws.

The book is divided into seven parts. After introducing basic terms and ideas in Part I, the author moves from the very bottom of the construction process, the code itself (Parts II, III, IV), up to higher-level processes. Part V is about working with existing code. Part VI covers controlling system development. The last part, Part VII, is about the requirements placed on the developer. For quick use the book includes checklists for all stages of construction, many figures and tables, and key terms, so it can also serve as a quick reference.

Below are my notes from the book (unfinished).

Part I. Foundation - Code Complete

Chapter 1. Welcome to Software Construction

  • 1.1 What is Software Construction?
    • Construction (programming) - the process of building software. It includes:
      • coding and debugging
      • unit testing
      • integration
      • integration testing
      • detailed design
      • construction planning
    • Not construction:
      • problem definition
      • system testing
      • documenting requirements
      • software architecture
    • Coding - putting design into code, mechanically.

Chapter 3. Measure Twice, Cut Once: Upstream Prerequisites

  • Introduction
    • Doing prerequisites is a must for high-quality software. Prerequisites can take up to 65% of total cost.
  • 3.1 Importance of Prerequisites
    • Define prerequisites at the beginning to stay out of trouble.
    • Do the prerequisites: they reduce the risk of failure.
    • The earlier an error is found, the cheaper it is to fix.
  • 3.2 Determine the kind of software you’re working on
    • Determine the kind to get the right balance between preparation and construction
      • There are 3 kinds of software projects (business systems, mission-critical systems, embedded life-critical systems), each with its own development characteristics.
    • Do prerequisites in iterative approaches too
    • The earlier an error is found, the cheaper it is to fix (principle I).
      • Pay attention to prerequisites in both iterative and sequential approaches
    • Choose an approach:
      • 80% of prerequisites up front, 20% later (sequential)
      • 20% of prerequisites up front, 80% later (iterative)
    • How to choose between sequential and iterative approaches
  • 3.3 Problem-Definition Prerequisite
    • Define the problem from the user’s point of view, in the user’s language
  • 3.4 Requirements Prerequisite
    • Write official requirements (spec document) in the user’s language:
      • Explicit requirements make the user, not the programmer, decide what the system does
      • They help avoid arguments
      • They lower cost (see principle I)
    • (The author does not explain how to write good requirements.)
    • Beware: requirements will change
    • How to handle changing requirements during construction
      • Assess the requirements themselves using the checklist in the chapter.
      • Make sure everyone knows the cost of requirements changes
      • Have a change-control procedure
      • Use an iterative approach to get feedback quickly
      • Can’t cope with the change requests? Dump the project
  • 3.5 Architecture Prerequisite
    • Have a good architecture before construction
    • Common components of an architecture document:
      • Give a system overview; state the alternatives considered and the reasons for the choice made
        • State the building blocks (classes or modules); they should correspond to requirements
        • Describe the communication between them
      • Specify the major classes (20% of the classes make up 80% of the functionality) and the rationale for them
      • Data design: specify the main tables and files.
      • Describe the impact of business rules on the architecture
      • Specify the UI if that is not done in the SRS
      • Error Processing
        • Important: should be treated at the architecture level
        • Choose strategies:
          • corrective (tries to recover) vs. detective (only detects)
          • active (anticipate errors, e.g. warn beforehand) or passive (respond on failure)
          • how to propagate errors (discard data, process the error, report later)
          • how to present error messages to the user (UI)
          • how to handle errors (logging, throwing, catching)
          • at what level to handle errors (up the call chain, in a separate class)?
          • define the level of responsibility of each class (see Defensive Programming)
          • Use default error strategies or create custom ones?
        • Choose techniques to handle and recover from errors (fault tolerance)
      • State, with reasoning, anything that may be infeasible to implement
      • Overengineering (TBD - didn’t understand)
      • State why a custom-built component is better than one that can be bought
      • Indicate strategies for handling possible future changes: what is likely to change and how it will be handled
    • The architecture should be consistent
    • The architecture document should keep a history
    • The architecture should state its objectives
    • The architecture should state its motivations
    • The architecture should be as machine- and language-independent as possible
    • The architecture document should be understandable
  • 3.6 Amount of Time to Spend on Upstream Prerequisites
    • Upstream prerequisites alone usually take up to 30% of the schedule
    • A large project needs a requirements analyst; a small one does not
    • Requirements work should be scheduled as a separate project (architecture design can be too)

Part II. Creating High-Quality Code

Chapter 5. Design in Construction

  • Introduction
    • Design should be done on all types of projects, from the smallest to the largest: all projects benefit
  • 5.1 Design Challenges
    • Design is the invention of a scheme for turning requirements into operational software.
    • Design is a wicked problem: it cannot be clearly defined until it is (at least partly) solved.
      The design process is sloppy: full of mistakes and open-ended.
    • Other challenges: design is heuristic (invention), subject to restrictions and tradeoffs, nondeterministic (many different solutions), and emergent
  • 5.2 Key Design Concepts
    • Manage complexity: break the system into simple pieces so that I can focus on one part without thinking about the others. If that is not possible, the design is bad.
    • Minimize essential complexity first, then accidental complexity. Essential complexity comes from the problem itself (the domain, real-world concepts). Accidental complexity comes from the IDE, programming language, OS, tools, etc.
    • Aim for the following design goals: minimal complexity, extensibility, reusability, layering, maintainability, loose coupling
    • Make design in levels.
      • Level 1. System. Nothing to design?
      • Level 2. Division into subsystems or packages (UI, business rules, database).
      • A subsystem or package is a group of objects (classes). Decide what they are and how they may communicate.
      • Minimize the number of connections; each should have a reason. Degrees of connection, from light to heavy: calling a routine, containing a class, inheriting from a class.
        • Common subsystems:
          • Business rules: the laws of the domain, such as the number of vacation days.
          • UI (asp.net WebForms, razor views)
          • Database access (sql server calls, mysql server calls)
          • System dependencies (windows os calls)
      • Level 3. Division into classes. What are they, what connections do they have, and what are their interfaces? Key point: distinguish between objects and classes.
      • Level 4. Division into routines and data. What are all the routines of a class?
      • Level 5. Inside the routine: paragraphs, pseudocode, algorithms.
  • 5.3 Design Building Blocks: Heuristics
    • Heuristics: ways of thinking about design that tend to produce a good one.
    • “By the book” or real-world object approach:
      • Determine objects
      • Determine what can be done to objects
      • Determine what objects can do to other objects: contain or inherit.
      • Determine what should be hidden and what shown (public and protected interfaces)
    • Form consistent abstractions
      • Then I can focus on the interface rather than the details. Example: house (details are doors, windows, etc.), door (doorknob, wood), doorknob (wood fibers, molecules of steel or varnish, etc.).
    • Encapsulate details, that is, forbid looking at the complexity (What is the door made of? Wood, glass, steel?).
    • Use inheritance when it simplifies the design. Inheritance: defining similarities and differences between objects. Polymorphism: the language’s ability to work with objects without knowing their specific kinds.
    • Hide secrets. Many researchers say it is important to hide information.
      • Decide which part of a class is visible (the interface) and which is not. A good class hides more than it reveals.
      • Hide details: file formats, data types used, etc.
      • Hide information on all levels: data types, routines signatures, classes, packages.
      • Possible barriers to hiding:
        • Performance issues due to wrapping info in classes or calling routines.
          Hide first, then look for bottlenecks if needed. A highly modular system is easy to optimize.
      • Ask “What should I hide?” when designing an interface, and at the other design levels too
    • Identify and isolate areas likely to change, such as: 1) business logic; 2) hardware dependencies; 3) input/output; 4) complicated design; 5) unstable language features; and others
    • Keep coupling loose
      • Module: a routine or a class.
      • Keep connections between modules as simple as possible. From loose to tight: sin(angle) -> initVars(var1, var2, var3) -> modules sharing the same global variable (see coupling kinds)
      • Criteria:
        • Size: the number of connections (parameters of a routine, public members of a class)
        • Visibility: how obvious a connection is (passing a parameter instead of using a global variable)
        • Flexibility: how easily a module can be called. A routine that takes a whole third object is harder to call than one that takes only the data it needs from that object (see example).
      • Coupling is not loose if you have to think about many other things (global data, the inner workings of other modules, etc.) when writing a module.
    • Look for common design patterns. (The benefits of patterns are listed in the book.)
    • Aim for strong cohesion. Cohesion: how strongly all the code in a module serves a single purpose.
    • Formalize class contracts. (Debug.Assert…)
  • Design practices.
    • Iterate. Go from high level to low level and back (top-down), from abstractions to details and back, or start at the low level (bottom-up).
    • Divide the program (divide and conquer)
    • Try top-down or bottom-up approaches. Each has strengths and weaknesses. Top-down is the more fundamental; use bottom-up when you cannot come up with abstractions.
    • Make prototypes if stuck. A prototype is throwaway code that answers a specific design question (e.g. about performance). See common problems with prototypes.
    • Collaborate.
    • Check McConnell’s advice on how much design is enough (experience? = low. Mission critical? = medium.)
    • Capture the work (see tools and techniques)
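
A minimal sketch of the “flexibility” criterion for loose coupling mentioned in 5.3. It is written in Python purely for illustration; Employee, years_of_service_tight and years_between are hypothetical names that do not come from the book.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    # Hypothetical record type, used only for this illustration.
    name: str
    hire_date: date
    salary: float


def years_between(start: date, end: date) -> int:
    """Looser coupling: the routine takes only the data it actually uses."""
    years = end.year - start.year
    # Subtract one year if the anniversary has not been reached yet.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def years_of_service_tight(employee: Employee, today: date) -> int:
    """Tighter coupling: callers must build a whole Employee to use this."""
    return years_between(employee.hire_date, today)
```

The second routine can only serve employees, while years_between can be called with any two dates, which is what makes its connection to the rest of the program looser.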

Chapter 6. Working Classes.

  • Introduction
    • Create cohesive classes. One responsibility per class. They let you work on one part of a program without knowing about the other parts.
  • 6.1 Class Foundations: Abstract Data Types (ADTs)
    • ADT - a collection of data and the operations that work on it. The operations describe the data to the rest of the program and let the rest of the program change it.
    • Think in terms of ADTs. Not “adding a node to a list” but “adding a cell to a spreadsheet” or “adding a passenger to a car simulation”.
    • Treat low-level data types (e.g. lists), common objects (e.g. files), and simple items (e.g. a light that is on or off) as ADTs.
    • Treat an ADT independently of the medium it is stored on.
  • 6.2 Good class interface
    • Good abstraction:
      • Good: one that represents a single abstraction and hides details.
      • Bad: poor cohesion, no connections between members.
      • Keep a consistent level of abstraction. Bad design: inheriting from a class just to get some operation when that operation has nothing to do with the type’s own abstraction.
      • Be sure you understand what abstraction is needed. E.g. do not expose too many operations.
      • Check that you provide paired services (on and off, add and remove, etc.) where needed.
      • Check that there are no two abstractions in one class.
      • ? How to use a programmatic instead of a semantic interface?
      • Check abstraction/cohesion when adding a new member to avoid interface erosion
    • Good encapsulation
      • Hide as much as possible. Keep accessibility as low as possible; expose a member only if doing so is consistent with the abstraction.
      • Don’t expose private members.
      • Don’t put private implementation details into the interface (C++ only?)
      • Don’t make assumptions about how the interface will be used
      • Check that the class’s interface is understandable without looking into the implementation. Client code should not rely on assumptions about the class’s inner workings.
  • 6.3 Design and implementation issues
    • Containment (“has a”) — idea about an object containing other objects or primitive types.
      • Prefer 7±2 data members per object: that many are easy to keep in mind (from psychology studies).
    • Inheritance (“is a”): the idea that a class is a specialization of another class.
      • Comply with the Liskov Substitution Principle (LSP) when using inheritance. LSP: subclasses must be usable through the base class interface without the caller knowing which subclass it has.
      • Use inheritance only if a subclass is truly a specialization of another class.
      • Be suspicious of a base class with only one derived class, and of overridden routines that do nothing.
      • Replace switch/case with polymorphism unless that would break the abstraction (e.g. by creating meaningless members like DoCommand)
    • Don’t use multiple inheritance.
    • On members:
      • Minimize the number of classes instantiated, calls to external routines, number of members, implicit members (e.g. private constructor), and so on. Keep fan-out low: do not spread like a fan (touching many things around).
    • On constructors:
      • Initialize all members in all constructors ?
      • About singleton and private constructor
      • About shallow and deep copies (reference and members copies)
    • Reasons to create class:
      • model real world objects
      • hide info
        • complex algorithm
        • access to a database, files, etc.
      • to reuse functionality
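
A small sketch of “replace switch/case with polymorphism” from 6.3, written in Python for illustration. Shape, Circle, Rectangle and total_area are hypothetical names, not examples from the book.

```python
from abc import ABC, abstractmethod
from typing import Iterable
import math


class Shape(ABC):
    """Base abstraction: every shape can report its area."""

    @abstractmethod
    def area(self) -> float:
        """Return the area of the shape."""


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self._radius = radius  # hidden detail, not part of the interface

    def area(self) -> float:
        return math.pi * self._radius ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self._width = width
        self._height = height

    def area(self) -> float:
        return self._width * self._height


def total_area(shapes: Iterable[Shape]) -> float:
    # No switch on the concrete type: each object supplies its own behaviour.
    return sum(shape.area() for shape in shapes)
```

total_area works with any Shape without knowing its concrete kind, which is also what the LSP asks of well-behaved subclasses.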

Chapter 7. High-Quality Routines

  • Introduction
    • A routine is a method or procedure that can be called for a single purpose.
    • (See example of bad routine)
  • 7.1 Valid reasons to create a routine
    • Create a routine in order to
      • reduce complexity (e.g. extraction of a method) (the most important reason) (even if it is two or three lines!)
      • give a name to abstraction
      • avoid duplicates
      • hide sequences (e.g. PopStack() for getting top and decrementing topStack variable).
  • 7.2 Design at the Routine level
    • Aim for routines that are as cohesive as possible (Cosine() is more cohesive than CosineAndTan())
    • Write routines that do one and only one operation (functional cohesion).
    • Functional cohesion is best. Sequential, communicational and temporal cohesion are less ideal but acceptable. Procedural, logical and coincidental cohesion are generally unacceptable.
  • 7.3 Good routine names.
    • Describe precisely what the routine does
    • Avoid meaningless names
    • Name a function after the value it returns
    • Name a procedure with a verb followed by the object it acts on. In object-oriented code omit the object name: the object is already part of the call.
  • 7.4 Routine length
    • Avoid writing long routines, though length matters less than other sources of complexity such as the number of variables
  • 7.5 How to use routine parameters
    • … (order, types and other)
  • 7.6 Special Considerations in the Use of Functions
    • A function is a routine that returns a value; a procedure is a routine that does not.
    • Don’t mix up functions and procedures (e.g. if (FormatOutput(report) == success) // something)
  • 7.7 Macro Routines and Inline Routines
    • … (Mostly advice for C++)
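
A short sketch of the function/procedure distinction from 7.6, in Python for illustration. format_report and print_report are hypothetical names chosen for this example.

```python
import io
from typing import TextIO


def format_report(report: dict) -> str:
    """Function: computes and returns a value, changes nothing else."""
    return "\n".join(f"{key}: {value}" for key, value in report.items())


def print_report(report: dict, out: TextIO) -> None:
    """Procedure: performs an action (writes to out) and returns nothing."""
    out.write(format_report(report) + "\n")


# Usage: the function is used in an expression, the procedure as a statement.
buffer = io.StringIO()
print_report({"total": 3, "failed": 1}, buffer)
```

Keeping the two roles apart avoids calls like the FormatOutput example above, where a routine both acts and is tested for a status value in the same expression.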

Chapter 8 Defensive Programming

  • Intro
    • Defensive programming is writing code (procedures) that protects itself from bad input (from other code or from users).
  • 8.1 Protecting Your Programs from Invalid Inputs
    • GIGO (garbage in, garbage out): bad input produces bad output even if the algorithm is right.
    • Don’t accept GIGO. Use “garbage in, nothing out”, “garbage in, error message out”, or “no garbage allowed in”.
    • Check all input from external sources
    • Check the values of all routine input parameters
  • 8.2 Assertions
    • An assertion is code that checks for conditions that should never occur (i.e. errors in my own code).
    • Use assertions to document and verify assumptions
    • Use assertions for errors that should never happen; use error handling for errors that can happen.
    • Use assertions to check and document preconditions (promises made by the client code) and postconditions (promises made by the called code).
    • If the client code is trusted (another class), assert; if not (input data), use error handling.
    • If the application must be robust (e.g. long-lived ones), use both assertions and error handling.
  • 8.3 Error-Handling Techniques [in routines or in whole system] (checks for errors that can occur, i.e. errors in external(?) data)
    • There are a number of them: return the closest legal value, return an error code (exception), display a UI message, call another error-handling routine, log a message, and others.
    • Error handling favours either correctness (never return an inaccurate result) or robustness (keep the software running); the two pull in opposite directions. The question is what to return: nothing? An error message? A fixed-up value?
    • Decide which technique to use at a high design level
  • 8.4 Exceptions
    • An exception is a means of saying “I don’t know what to do with this case”. The second way to use exceptions is to build some workflow on them.
    • Use exceptions to notify other parts of the program about an error.
    • Exceptions can increase complexity by requiring the calling code to know about them.
    • Handle an error locally if it can be handled there; don’t pass it to other code.
    • Throw exceptions at the same level of abstraction as the routine’s interface (bad example: EndOfFileException thrown from a GetTaxiID() method)
    • Handle the exceptions thrown by the external libraries you use
    • Consider using a centralized exception reporter (there are trade-offs)
    • Consider creating my own exception base class.
    • Consider alternatives: other error-handling techniques, or shutting everything down and releasing resources.
  • 8.5 Barricade Your Program to Contain the Damage Caused by Errors
    • At the package level create validation classes. They clean dirty data before passing it to classes that assume clean data (a firewall in OOP terms).
    • At the class level write public methods that assume the data is dirty and private methods that assume it is clean.
    • Convert questionable data to proper data types as soon as possible (e.g. the string “true” to a boolean true)
    • Use assertions inside the barricade (my code is wrong) and error-handling techniques outside the barricade (the data is wrong).
  • 8.6 Debugging Aids (did not understand)
    • Be willing to spend more resources on debugging aids during development .. (What are debugging aids? Code that runs in development only?)
    • Fail hard in development (e.g. throw an exception in the default branch of a switch/case) so that the code fails softly in production.
    • Remove debugging aids from the production build (e.g. with the preprocessor)
  • 8.7
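
A minimal sketch of the barricade idea from 8.5, in Python for illustration: error handling outside the barricade, assertions inside it. The names parse_age, _ticket_price, ticket_price_for and the 0..150 range are assumptions made up for this example.

```python
def parse_age(raw: str) -> int:
    """Barricade: external data may be wrong, so use error handling here."""
    try:
        # Convert questionable data to a proper type as early as possible.
        age = int(raw.strip())
    except ValueError:
        raise ValueError(f"age must be an integer, got {raw!r}") from None
    if not 0 <= age <= 150:  # hypothetical legal range
        raise ValueError(f"age out of range: {age}")
    return age


def _ticket_price(age: int) -> float:
    """Inside the barricade: a bad value here means a bug in my own code."""
    assert 0 <= age <= 150, "caller must pass a validated age"
    return 5.0 if age < 12 else 10.0


def ticket_price_for(raw_age: str) -> float:
    """Public entry point: assumes dirty data and cleans it first."""
    return _ticket_price(parse_age(raw_age))
```

Python drops assert statements when run with the -O option, which loosely corresponds to removing debugging aids from a production build.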

Chapter 9. Pseudocode Programming Process (PPP). (Looks redundant, 25.09.2013 13:01)

  • Intro
    • PPP is better than test-driven development (see 9.4). PPP makes it easier to work out the details of routines and classes, and it produces the documentation along the way.
  • 9.1 Summary of building classes and routines.
    • This is an iterative process. Working out the details of a routine brings up new routines.
  • 9.2 Pseudocode for Pros
    • PPP: a process for designing a class, a routine, or a program using pseudocode (an English-like notation for describing algorithms)
    • PPP is important.
    • Guidelines:
      • Avoid code-level details, but write at a level low enough that statements can be derived from it quickly
      • Describe intent
      • Iterate
    • Benefits …
  • 9.3 Constructing routines by using PPP
    • Design using PPP, test using unit tests. The interface comes out of the PPP.
  • 9.4 Alternatives…
  • Notes:
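
A minimal sketch of how PPP might look in practice, in Python for illustration: the pseudocode is written first as plain-English comments, then each comment is filled in with code and stays as documentation. The routine average_valid_score and the 0..100 range are hypothetical, made up for this example.

```python
def average_valid_score(scores):
    # Keep only the scores that fall inside the legal range 0..100
    valid = [score for score in scores if 0 <= score <= 100]

    # If no valid scores remain, report that there is no average
    if not valid:
        return None

    # Otherwise return the sum of the valid scores divided by their count
    return sum(valid) / len(valid)
```

Each comment states intent rather than mechanics, so it remains accurate even if the statements below it are rewritten.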

Words:

  • Heuristic - based on rules of thumb and discovery; inventive.
  • Sloppy - careless, untidy.
  • Cohesive - connected, unified.
  • Cohesion - unity, tight bonding.
  • Deficiency - shortage.

Part III. Variables

Chapter 10. General Issues in Using Variables

  • 10.1 Data Literacy
  • 10.2 Making variable declarations easy. Don’t use implicit variable declarations.
  • 10.3 Guidelines for initializing variables
    • Improper initialization leads to debugging horrors
    • (Mostly advice for C++ and VB, so skipping most of it. This is mostly not a problem with R# and modern IDEs.)
    • Declare and initialize each variable right before its first use.
    • Initialize all members in the constructor
    • Know the initial value of each variable
  • 10.4 Scope
    • Scope: how widely known a variable is (global, local, class, etc).
    • Span: the number of lines between successive references to a variable.
    • Live time: the number of statements from the first to the last reference to a variable.
    • Keep span, scope and live time as short as possible: the code is less vulnerable, easier to follow, has fewer initialization errors, and is more readable and easier to refactor.
    • Guidelines: initialize loop variables right before the loop, group the statements that use the same variables, extract those groups into separate routines, widen scope only as a last resort.
    • All these guidelines serve readability. It is easy to write a program when all variables are global, but it is harder to read it.
  • 10.5 Persistence
    • Beware of variable lifetime. Problems arise when a variable lives for a shorter time than expected. (And something else.)
  • 10.6 Binding Time
    • Binding time is the time at which a variable and its value are bound together.
    • Types:
      • When the code is written (hard-coded)
      • At compile time
      • At load time (e.g. at program start)
      • At object initialization (e.g. when a window is created)
      • Each time (e.g. each time the window is drawn)
    • The later the binding, the more flexible but also the more complex. Choose a reasonable balance between flexibility and complexity.
  • 10.7 Relationship Between Data Types and Control Structures
    • Sequential data type (user’s age, name, profession) corresponds to sequential code structure. Selective — selective. Iterative (array) — iterative.
  • 10.8 Using each variable exactly for one purpose.
    • Use a variable exactly for one purpose
    • Don’t hide variable purpose by using it to store two meanings
    • Remove unused variables.
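
A small sketch of how span and live time from 10.4 could be computed, in Python for illustration. It assumes a variable’s references are given as a sorted, non-empty list of line numbers; the function names are hypothetical.

```python
def spans(reference_lines):
    """Lines strictly between each pair of successive references."""
    return [later - earlier - 1
            for earlier, later in zip(reference_lines, reference_lines[1:])]


def average_span(reference_lines):
    """Mean of all spans; 0.0 if the variable is referenced only once."""
    gaps = spans(reference_lines)
    return sum(gaps) / len(gaps) if gaps else 0.0


def live_time(reference_lines):
    """Statements from the first reference to the last one, inclusive."""
    return reference_lines[-1] - reference_lines[0] + 1
```

For a variable referenced on lines 1, 5, 6 and 20, spans gives [3, 0, 13] and live_time gives 20; moving the references closer together lowers both numbers.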

Chapter 11. The Power of Variable Names

  • 11.1. Considerations in Choosing Good Names
    • Choose names that describe the variable fully, accurately and as specifically as possible.
    • Choose names in terms of ‘what’ rather than ‘how’, i.e. in terms of the problem domain.
    • … (about how long it should be and other)
  • … (About conventions and other)
  • Code is read far more often than it is written. Name variables for readability.

Chapter 12. Fundamental Data Types

  • In enumerations, define first and last members to mark the start and end of a loop.
    enum { First = 0, ColorBlue = 0, ColorLightBlue = 1, ColorRed = 2, Last = 2 }
  • Use new types (structures) for data that might change.
  • Use the checklist in the chapter.
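
A rough Python counterpart of the First/Last enumeration trick above, for illustration only. Color and its members are hypothetical names; FIRST and LAST are aliases that share values with real members.

```python
from enum import IntEnum


class Color(IntEnum):
    BLUE = 0
    LIGHT_BLUE = 1
    RED = 2
    # Aliases marking the loop bounds; they add no new members.
    FIRST = 0
    LAST = 2


# Loop from the first to the last member without naming specific colors.
names = [Color(value).name for value in range(Color.FIRST, Color.LAST + 1)]
```

In Python the enumeration can also be iterated directly (for color in Color), so the bounds are needed less often than in C-like languages.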

Chapter 13. Unusual Data Types.

  • 13.1 Structures… (They are unusual because they are a mix of basic types)
  • 13.2 Pointers … (pointers overview, tips for C like languages).
  • 13.3 Global data … (Don’t use it unless it can’t be otherwise. In that case use routine to access it.)

Part V. Code Improvements

Chapter 22. Developer Testing

  • Introduction
    • Testing is described. It is divided into black-box and white-box testing. Testing is not debugging (debugging is finding the cause of a bug and correcting it)
    • Kinds of testing:
      • unit (testing a single unit),
      • component (same as unit, but the unit is developed by multiple programmers),
      • integration (how components work together),
      • regression (re-running tests that passed before)
      • and system testing (testing in the environment the software is supposed to work in)
  • 22.1. Role of Developer Testing in Software Quality
    • About how efficient developer testing is, what is hard in developer testing, how much time should it take, what to do with the results of testing, about advantages of testing during construction.
    • Testing is not as effective as collaborative development
    • Difficulties:
      • testing runs counter to the goal of construction: finding errors vs. not making them
      • tested <> bug-free
      • testing = measurement, not improvement of quality
      • If you do not expect to find errors, you will find none. Hope to find errors, and you will find them.
    • Testing (not debugging) should take 5-25% of construction.
    • Use the results of testing to assess the software, to make corrections, and as guidance for becoming a better developer.
    • Testing should be done during construction (test first or test after), routine by routine; glass-box testing finds more.
  • 22.2 Recommended Approach to Developer Testing
    • Write test cases at prerequisites/design/construction stages
    • It is cheaper to fix errors earlier.
    • Better to test first because:
      • defects in prerequisites are found earlier = cheaper
      • design tends to be better
      • (more)
  • Developer testing is limited:
    • Developers tend to write tests that show the code works (clean tests) rather than tests that try to break it (dirty tests)
    • Developers overestimate their test coverage; the actual figure is lower, about 50% on average.
    • (one more)
  • 22.3 Bag of testing tricks
    • It is practically impossible to test every case: there are too many (a case is a unique set of input data).
    • “Coverage monitor” tells how much code was covered.
    • Pick test cases so that each one tells you something new. The set of such cases is the basis. (TBD: read again: how to test software?)
    • Methods to cover the basis:
      • Structured basis testing (SBT): each statement is tested at least once. To find the number of cases, count the paths: start with 1 and add 1 for each if, while, for, and so on.
      • To test control flow.
      • (Logical (code) coverage testing has more test cases than SBT)
      • Data flow testing.
        • About data states: defined, used, killed (+ entered and exited). About wrong data states, e.g. defined-killed.
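
A minimal sketch of structured basis testing from 22.3, in Python for illustration. The routine shipping_cost and its prices are hypothetical; the point is only the counting: 1 for the straight path plus 1 for each decision gives 3 test cases here.

```python
def shipping_cost(weight_kg: float, express: bool) -> float:
    cost = 5.0           # straight path through the routine: 1 case
    if weight_kg > 10:   # first decision: +1 case
        cost += 2.0
    if express:          # second decision: +1 case
        cost *= 2
    return cost


# The three basis cases: nominal path, first "if" true, second "if" true.
basis_cases = [
    ((1.0, False), 5.0),
    ((11.0, False), 7.0),
    ((1.0, True), 10.0),
]
results = [shipping_cost(*args) == expected for args, expected in basis_cases]
```

These three cases execute every statement at least once; they do not cover every combination of conditions (heavy and express together is untested), which is why fuller logic coverage needs more cases than SBT.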

Chapter 23. Debugging

  • Introduction
    • Debugging - the process of finding the cause of an error and correcting it. It can take up to 50% of project construction time.
  • 23.1 Overview of Debugging Issues
    • Debugging is not a way to get quality software! It is a last resort: software should be well defined, designed and constructed in the first place.
    • Debugging speed varies: there is effective and ineffective debugging. The difference can be as much as 13 times.
    • (TBD)