Each One Correct, Together They Explode: Contracts and Integration
You have given each Agent its own independent workspace and set clear responsibility boundaries. Agent A passed all tests in the payments module; Agent B passed all tests in the billing module. You merge both branches into the trunk, run integration tests, and they fail.
Agent A's payment interface returns an amount field called total_amount; Agent B's billing module expects to receive a field called payment_amount. Each Agent's implementation is functionally correct, but their understanding of the interface is inconsistent. This type of problem rarely occurs in single-Agent scenarios because the same Agent naturally maintains naming consistency within the same context. In multi-Agent scenarios, each Agent works within its own context, and unless there are explicit contractual constraints, their assumptions may diverge on any detail.
The API Contract introduced in Chapter 2 takes on a new role here. In single-Agent scenarios, the Contract is a communication tool between you and the Agent, ensuring the Agent understands the interface requirements. In multi-Agent scenarios, the Contract becomes the only coordination channel between Agents. Agents do not converse with each other, and there is no real-time sync, so the Contract is their only basis for consensus. This raises the bar significantly for Contracts: field naming, data formats, error codes, and state transitions must all be explicitly defined. With a single Agent, the Contract can be moderately vague, because you and the Agent can clarify details in conversation. With multiple Agents, every ambiguity in the Contract is a potential integration failure point.
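To make "explicitly defined" concrete, here is a minimal sketch of what such a Contract might look like when written as code that both Agents import. Everything in it is an assumption for illustration: the names PaymentResult, PaymentStatus, ErrorCode, ALLOWED_TRANSITIONS and can_transition, the currency field, the use of integer cents, and the specific error codes and transitions. The choice of total_amount over payment_amount is arbitrary; what matters is that the choice is made once, in one place.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PaymentStatus(Enum):
    PENDING = "pending"
    PAID = "paid"
    FAILED = "failed"
    REFUNDED = "refunded"


class ErrorCode(Enum):
    # Hypothetical error codes; a real Contract would list its own.
    CARD_DECLINED = "card_declined"
    INSUFFICIENT_FUNDS = "insufficient_funds"


# State transitions the Contract allows. Anything not listed is a violation.
ALLOWED_TRANSITIONS = {
    PaymentStatus.PENDING: {PaymentStatus.PAID, PaymentStatus.FAILED},
    PaymentStatus.PAID: {PaymentStatus.REFUNDED},
    PaymentStatus.FAILED: set(),
    PaymentStatus.REFUNDED: set(),
}


@dataclass(frozen=True)
class PaymentResult:
    total_amount: int                     # field naming: one agreed name
    currency: str                         # data format: three uppercase letters
    status: PaymentStatus
    error_code: Optional[ErrorCode] = None

    def __post_init__(self):
        amount = self.total_amount
        # Data format: non-negative integer in minor units, never a float.
        if isinstance(amount, bool) or not isinstance(amount, int) or amount < 0:
            raise ValueError("total_amount must be a non-negative integer (cents)")
        code = self.currency
        if len(code) != 3 or not code.isalpha() or not code.isupper():
            raise ValueError("currency must be three uppercase letters")
        # Error codes: present if and only if the payment failed.
        failed = self.status is PaymentStatus.FAILED
        if failed != (self.error_code is not None):
            raise ValueError("error_code is required exactly when status is FAILED")


def can_transition(current: PaymentStatus, target: PaymentStatus) -> bool:
    """Return True if the Contract allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

With a definition like this, an Agent that builds a result with the wrong field name or a float amount fails at construction time inside its own workspace, rather than at merge time on the trunk.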
The integration testing discussed in Chapter 3 shifts from "recommended" to "mandatory" here. In single-Agent scenarios, Trophy tests are your best tool for verifying output quality. In multi-Agent scenarios, integration tests are your only tool for discovering inconsistencies between Agents. Module-level tests can only verify a module's own correctness; they cannot verify whether assumptions across modules are aligned.
Integration also needs to happen more often. Traditional teams might integrate once a day. In multi-Agent parallel development, every completed task chunk should trigger an integration test run. The earlier integration issues are found, the cheaper they are to fix. If two Agents each run for a full day before integrating, the problems discovered might require throwing out a day's work. If integration happens every hour, the worst case is rolling back an hour of work.