















## Parameterized library

A domino gate can have a large NMOS pull-down network.

Small short-circuit current and small driven load.

No complementary (PMOS) part. In small gates, the delay overhead of the output inverter may offset the advantage of fast switching.

The library size grows dramatically with the length (s) and width (p) of the gate. (s,p) = (3,6): 6877; (4,4): 3503; (4,6): 222943.

A parameterized library is therefore used for technology mapping of domino logic.















## Node cost functions

Here, the cost is area, measured as the number of transistors.

Literal operation: C=C+1

Literal operation corresponds to a primary input or a situation where a new domino structure is started after gate formation operation.

OR/AND operation: C=Literal(C<sub>l</sub>) + Literal(C<sub>r</sub>), where l and r denote the left and right children

Gate formation operation: C=C<sub>min</sub> +4

The minimal-cost solution $C_{min}$ is the minimum over all H\*W optimal subsolutions.

The '4' accounts for two clock-control transistors plus an inverter.
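The three operations above can be sketched as a small bottom-up table computation. This is only an illustrative sketch under several assumptions that are not on the slide: the network is a binary AND/OR tree without fanout, AND stacks transistors in series (height adds), OR places them in parallel (width adds), and the names `node_costs`, `mapped_cost`, `max_h`, and `max_w` are hypothetical.

```python
from itertools import product

GATE_OVERHEAD = 4  # two clock-control transistors + an inverter (from the slide)


def node_costs(node, max_h, max_w):
    """Return {(height, width): transistor count} for pull-down networks at `node`.

    A node is either a string (primary input) or a tuple (op, left, right)
    with op in {"and", "or"}. Assumes max_h >= 2 and max_w >= 2.
    """
    if isinstance(node, str):
        # Literal operation on a primary input: C = 0 + 1
        return {(1, 1): 1}

    op, left, right = node
    left_table = node_costs(left, max_h, max_w)
    right_table = node_costs(right, max_h, max_w)

    table = {}
    for ((h1, w1), c1), ((h2, w2), c2) in product(left_table.items(),
                                                  right_table.items()):
        if op == "and":          # series connection (assumed): heights add
            shape = (h1 + h2, max(w1, w2))
        else:                    # parallel connection (assumed): widths add
            shape = (max(h1, h2), w1 + w2)
        if shape[0] <= max_h and shape[1] <= max_w:
            cost = c1 + c2       # OR/AND operation: sum of the two children
            if cost < table.get(shape, float("inf")):
                table[shape] = cost

    # Gate formation (C_min + 4), then a literal (+1) that starts a new
    # domino structure fed by the freshly formed gate.
    restart = min(table.values()) + GATE_OVERHEAD + 1
    if restart < table.get((1, 1), float("inf")):
        table[(1, 1)] = restart
    return table


def mapped_cost(root, max_h, max_w):
    """Total transistor count when a gate is formed at the root."""
    return min(node_costs(root, max_h, max_w).values()) + GATE_OVERHEAD


and4 = ("and", ("and", "a", "b"), ("and", "c", "d"))
print(mapped_cost(and4, max_h=4, max_w=2))  # one gate: 4 + 4 transistors
print(mapped_cost(and4, max_h=2, max_w=2))  # three gates of 2 + 4 transistors
```

With a height limit of 4 the four-input AND fits in one gate; with a limit of 2 the sketch is forced to form two inner gates and a third gate that combines them.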













## Implementation and results (1)

Execution time: < 10 seconds

Comparison with another domino mapper

| Circuits | Our approach<br>#trans/#level | Prasad et al.<br>#trans/#level | Reduction<br>% |
|----------|-------------------------------|--------------------------------|----------------|
| c8       | 289/6                         | 328/7                          | 13.5%          |
| 16       | 890/2                         | 890/3                          | 0%             |
| C880     | 1056/9                        | 1499/7                         | 42.0%          |

Comparison of various mapping methods

| Circuits | Basic mapping<br>#trans/#level | Wide AND/OR gate<br>#trans/#level | Dual-mono gate<br>#trans/#level |
|----------|--------------------------------|-----------------------------------|---------------------------------|
| C1355    | 1824/9                         | 1824/9                            | 1360/7                          |
| C1908    | 1978/18                        | 1965/18                           | 1588/14                         |
| K2       | 2884/16                        | 2738/15                           | 2884/16                         |

| Circuits | Domino<br>#trans/#levels | SIS: 44-3.genlib<br>#trans/#levels | Reduction<br>% | Dup-ra |
|----------|--------------------------|------------------------------------|----------------|--------|
| i6       | 761/3                    | 1194/5                             | 36.3%          | 13     |
| C1355    | 1360/7                   | 1378/20                            | 1.3%           | 77     |
| C3540    | 4002/20                  | 3140/34                            | -27.5%         | 92     |
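The reduction percentages are not defined on the slides. As a quick arithmetic check, the sketch below reproduces the reported values under two assumed conventions inferred from the numbers themselves: in the comparison with Prasad et al. the difference appears to be taken relative to the transistor count of this approach, while in the comparison with the SIS mapping it appears to be taken relative to the SIS count. The helper name `pct` is hypothetical.

```python
def pct(difference, denominator):
    """Percentage rounded to one decimal, as printed in the tables."""
    return round(100.0 * difference / denominator, 1)


# (ours, Prasad et al.) transistor counts: difference relative to ours (assumed)
vs_prasad = {"c8": (289, 328), "C880": (1056, 1499)}
prasad_pct = {name: pct(other - ours, ours)
              for name, (ours, other) in vs_prasad.items()}

# (domino, SIS) transistor counts: difference relative to SIS (assumed)
vs_sis = {"i6": (761, 1194), "C1355": (1360, 1378), "C3540": (4002, 3140)}
sis_pct = {name: pct(sis - domino, sis)
           for name, (domino, sis) in vs_sis.items()}

print(prasad_pct)
print(sis_pct)
```

The negative entry for C3540 means the domino mapping uses more transistors than the SIS mapping on that circuit, although it needs fewer levels.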







## The timing-driven static-domino partitioning algorithm

Cost: area or power.

Outline of the algorithm

Perform fast static and domino mapping on the entire logic network.

Apply a PERT-based timing analysis method to find the candidate cut nodes in the network.

Build the flow network from the candidate cut nodes. The edge capacities are determined from the cost difference of static and domino implementations.
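The timing-analysis step of the outline can be illustrated with a generic PERT-style slack computation on a small DAG. This is a minimal sketch, not the authors' implementation: the slides do not say how slack is turned into candidate cut nodes or how edge capacities are derived, and the names `pert_slack`, `fanins`, `delay`, and `required_time` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def pert_slack(fanins, delay, required_time):
    """Return {node: slack} for a combinational DAG.

    fanins:        node -> list of predecessor nodes
    delay:         node -> delay of that node
    required_time: required arrival time at every primary output
    """
    # Predecessors come before successors in this order.
    order = list(TopologicalSorter(fanins).static_order())

    # Forward pass: latest arrival time at the output of each node.
    arrival = {}
    for n in order:
        latest_input = max((arrival[p] for p in fanins.get(n, ())), default=0.0)
        arrival[n] = latest_input + delay[n]

    # Build fanout lists for the backward pass.
    fanouts = {n: [] for n in order}
    for n, preds in fanins.items():
        for p in preds:
            fanouts[p].append(n)

    # Backward pass: earliest required time at the output of each node.
    required = {}
    for n in reversed(order):
        if not fanouts[n]:
            required[n] = required_time
        else:
            required[n] = min(required[s] - delay[s] for s in fanouts[n])

    return {n: required[n] - arrival[n] for n in order}


fanins = {"g1": ["a", "b"], "g2": ["b", "c"], "out": ["g1", "g2"]}
delay = {"a": 0, "b": 0, "c": 0, "g1": 3, "g2": 1, "out": 1}
slack = pert_slack(fanins, delay, required_time=4.0)
print(slack)
```

In this toy network `g1` lies on the critical path (zero slack) while `g2` has two units of slack, so under a timing specification only `g2` and its input `c` could tolerate a slower implementation.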















| Circuits | Domino<br>#trans | Static<br>#trans/delay | No spec<br>#trans | Spec=(*1.25)<br>#trans | Spec=(*1.05)<br>#trans | CPU<br>(s) |
|----------|------------------|------------------------|-------------------|------------------------|------------------------|------------|
| C3540    | 4527             | 2850/1.43              | 2748              |                        | 3987                   | 10.9       |
| des      | 9945             | 8134/4.25              | 7527              | 7536                   | 7536                   | 60.2       |
| C7552    | 7919             | 5464/2.35              | 5370              | 5987                   | 6198                   | 30.9       |

Partitioning flow for two-phase clocking scheme

| Circuits | Domino<br>#trans | Static<br>#trans/delay | Spec=(*1.25)<br>#trans/#latches | Spec=(*1.05)<br>#trans/#latches |
|----------|------------------|------------------------|---------------------------------|---------------------------------|
| c2670    | 1992             | 1754/1.75              | 1538/52                         | 1538/52                         |
| K2       | 2884             | 2896/1.54              | 2691/157                        | 2795/115                        |
| C3540    | 4527             | 2850/1.43              | 3063/60                         | 3235/68                         |
| des      | 9945             | 8134/4.25              | 7510/118                        | 7513/119                        |
| C7552    | 7919             | 5464/2.35              | 5754/164                        | 5772/164                        |

## Conclusion

Synthesis procedure for domino logic discussed

Technology mapper: fast, good solutions


Partitioning between static and domino to gain advantages of both

Placed into a flow including transistor sizing and noise fixes for charge sharing
