Refactoring Analysis Guide¶
This document describes the graph analysis capabilities in repo-ctx that support code refactoring and modernization projects. It explains which analyses are most valuable, how to interpret results, and how to use them for refactoring decisions.
Table of Contents¶
- Refactoring Workflow
- Analysis Levels
- Graph Types
- Coupling Metrics
- Cycle Detection
- Practical Recommendations
- Implementation Status
Refactoring Workflow¶
A typical refactoring/modernization project follows these phases, each requiring specific information:
| Phase | Question to Answer | Required Analysis |
|---|---|---|
| 1. Assessment | "What's the current state?" | Architecture overview, dependency graphs |
| 2. Identification | "What needs to change?" | Coupling hotspots, code smells, cycles |
| 3. Prioritization | "What's risky vs. safe?" | Coupling metrics, stability analysis |
| 4. Planning | "How to structure the target?" | Natural boundaries, cohesion groups |
| 5. Execution | "If I change X, what breaks?" | Reverse dependencies, impact analysis |
Analysis Levels¶
Dependency analysis can be performed at different granularity levels. Each level serves different purposes in the refactoring process.
Level Comparison¶
| Level | Granularity | Unit of Analysis | Best For |
|---|---|---|---|
| Package/Module | Coarsest | Directory/namespace | High-level architecture, module extraction |
| File | Medium | Source file | Small projects, import analysis |
| Class | Fine | Class/Interface | OOP refactoring, coupling analysis |
| Function/Method | Finest | Callable | Call graph, impact analysis |
Recommended Level by Project Size¶
| Project Size | Primary Level | Secondary Level |
|---|---|---|
| Small (<50 files) | File | Function |
| Medium (50-500 files) | Class | Package |
| Large (>500 files) | Package | Class |
Why Class-Level is Most Valuable for Refactoring¶
- Classes are the unit of change: Most refactoring patterns (Extract Class, Move Method, Extract Interface) operate at class level
- Classes have semantic meaning: Unlike files, classes represent concepts
- Classes define responsibilities: Coupling between classes reveals responsibility boundaries
- OOP design principles apply: SOLID principles are class-level concerns
Graph Types¶
1. Dependency Graph¶
Shows directed dependencies between components.
File-Level (current implementation):
Class-Level (high value for refactoring):
Dependency Types:
- imports: Module/file import
- extends: Inheritance relationship
- implements: Interface implementation
- uses: Field type, parameter, return type
- calls: Method invocation
- creates: Object instantiation
2. Inheritance Tree¶
Shows class hierarchy relationships.
BaseEntity
├── User
│ ├── AdminUser
│ └── GuestUser
└── Product
├── PhysicalProduct
└── DigitalProduct
Refactoring Signals: - Depth > 4 levels: Consider flattening with composition - Width > 10 children: Missing abstraction layer - Empty subclasses: Unnecessary inheritance - Diamond patterns: Complexity smell
3. Call Graph¶
Shows function/method invocation relationships.
main()
├── initialize()
│ ├── loadConfig()
│ └── connectDB()
└── processRequest()
├── validateInput()
├── executeLogic()
│ └── queryDatabase()
└── formatResponse()
Refactoring Signals: - High fan-in (many callers): Core function, change carefully - High fan-out (many callees): God function, consider splitting - No callers: Potentially dead code - Recursive cycles: Usually intentional
4. Package/Module Graph¶
Shows dependencies between packages/directories.
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ api/ │────>│ services/ │────>│ models/ │
└─────────────┘ └─────────────┘ └─────────────┘
│
v
┌─────────────┐
│ utils/ │
└─────────────┘
Refactoring Signals: - Cycles between packages: Architectural problem - Many cross-package dependencies: Poor modularization - Utils depending on business logic: Inverted dependency
Coupling Metrics¶
Coupling metrics quantify how connected components are, helping prioritize refactoring efforts.
Core Metrics¶
| Metric | Formula | Meaning |
|---|---|---|
| Ca (Afferent Coupling) | Count of incoming dependencies | How many depend ON this |
| Ce (Efferent Coupling) | Count of outgoing dependencies | How many does this DEPEND on |
| Instability | Ce / (Ca + Ce) | 0.0 = stable, 1.0 = unstable |
| Abstractness | Abstract types / Total types | Ratio of abstract classes |
| Distance from Main Sequence | |A + I - 1| | 0.0 = ideal balance |
Interpreting Instability¶
Instability = Ce / (Ca + Ce)
0.0 0.5 1.0
│ │ │
┌──────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
│ STABLE │ │ BALANCED │ │ UNSTABLE │
│ │ │ │ │ │
│ Many depend │ │ Moderate │ │ Few depend │
│ on this │ │ coupling │ │ on this │
│ │ │ │ │ │
│ HARD to │ │ │ │ EASY to │
│ change │ │ │ │ change │
└─────────────┘ └─────────────┘ └─────────────┘
Refactoring Decision Matrix¶
| Instability | Abstractness | Zone | Meaning | Action |
|---|---|---|---|---|
| Low (0.0-0.3) | High (0.7-1.0) | Ideal | Stable abstraction | Leave alone |
| Low | Low | Zone of Pain | Rigid, hard to change | Add interfaces |
| High (0.7-1.0) | Low | Normal | Volatile implementation | OK if intentional |
| High | High | Zone of Uselessness | Over-abstracted | Simplify |
Practical Thresholds¶
| Metric | Healthy | Warning | Critical |
|---|---|---|---|
| Ca (Afferent) | < 10 | 10-20 | > 20 |
| Ce (Efferent) | < 10 | 10-15 | > 15 |
| Class size (methods) | < 20 | 20-30 | > 30 |
| Method parameters | < 4 | 4-6 | > 6 |
Cycle Detection¶
Cycles in dependencies are architectural smells that should be eliminated.
Cycle Severity by Level¶
| Level | Severity | Example | Impact |
|---|---|---|---|
| Package | CRITICAL | api/ ↔ services/ |
Cannot deploy independently |
| Class | HIGH | UserService ↔ OrderService |
Tight coupling, hard to test |
| Function | LOW | Mutual recursion | Often intentional |
Cycle Types¶
Direct Cycle (2 nodes):
Transitive Cycle (3+ nodes):
Self-Loop:
Breaking Cycles¶
Strategy 1: Extract Interface
Strategy 2: Dependency Injection
Before: ServiceA creates ServiceB, ServiceB creates ServiceA
After: Container creates both, injects dependencies
Strategy 3: Extract Shared Module
Strategy 4: Event-Based Decoupling
Practical Recommendations¶
High-Priority Refactoring Targets¶
- God Classes (Ca > 20, Ce > 15)
- Many depend on it, it depends on many
-
Split by responsibility
-
Unstable Core (Instability > 0.7, Ca > 10)
- Volatile but many dependents
-
Stabilize with interfaces
-
Package Cycles
- Any cycle at package level
-
Break immediately
-
Deep Hierarchies (depth > 4)
- Hard to understand and modify
-
Flatten with composition
-
Bidirectional Dependencies
- Two classes depending on each other
- Extract interface or mediator
Output Format for Refactoring Reports¶
## Refactoring Recommendations
### Critical Issues
1. **Package cycle detected**: `services` ↔ `models` ↔ `utils`
- Impact: Cannot deploy independently
- Suggestion: Extract shared types to `common` package
### High-Priority Targets
1. **God class**: `UserManager` (Ca=23, Ce=15)
- 23 classes depend on it, touches 15 others
- Suggestion: Split by responsibility
2. **Unstable core**: `DatabaseConnection` (Instability=0.8, Ca=45)
- 45 dependents but high instability
- Suggestion: Add interface, reduce direct dependencies
### Improvement Opportunities
1. **Deep hierarchy**: `BaseWidget` → 6 levels deep
- Suggestion: Flatten using composition
2. **Tight coupling**: `OrderService` ↔ `PaymentService`
- Bidirectional dependency
- Suggestion: Extract `OrderPaymentMediator`
Implementation Status¶
Currently Implemented¶
| Feature | Level | Status |
|---|---|---|
| Dependency Graph | File | ✅ Implemented |
| Cycle Detection | File | ✅ Implemented |
| Layer Detection | File | ✅ Implemented |
| DSM (Design Structure Matrix) | File | ✅ Implemented |
| Mermaid Diagrams | File | ✅ Implemented |
Planned Implementation¶
| Feature | Level | Priority |
|---|---|---|
| Dependency Graph | Class | P1 - Essential |
| Coupling Metrics (Ca/Ce) | Class | P1 - Essential |
| Cycle Detection | Class, Package | P1 - Essential |
| Inheritance Tree | Class | P2 - Important |
| Call Graph | Function | P2 - Important |
| Dependency Graph | Package | P2 - Important |
| Cohesion Analysis | Class | P3 - Nice to have |
References¶
- Martin, R. (2003). Agile Software Development: Principles, Patterns, and Practices
- Lakos, J. (1996). Large-Scale C++ Software Design
- Fowler, M. (2018). Refactoring: Improving the Design of Existing Code
- Bass, L., Clements, P., Kazman, R. (2012). Software Architecture in Practice