LUTMIN: FPGA Logic Synthesis with MUX-Based and Cascade Realizations

Similar documents
Sum-of-Generalized Products Expressions Applications and Minimization

On the Complexity of Error Detection Functions for Redundant Residue Number Systems

Analysis and Synthesis of Weighted-Sum Functions

Irredundant Sum-of-Products Expressions - J. T. Butler

An Application of Autocorrelation Functions to Find Linear Decompositions for Incompletely Specified Index Generation Functions

doi: /TCAD

On the Minimization of SOPs for Bi-Decomposable Functions

An Implementation of an Address Generator Using Hash Memories

x 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10 x 11 x 12 x 13 x 14 x 15 x 16

On LUT Cascade Realizations of FIR Filters

Detecting Support-Reducing Bound Sets using Two-Cofactor Symmetries 1

Multi-Terminal Multi-Valued Decision Diagrams for Characteristic Function Representing Cluster Decomposition

Large-Scale SOP Minimization Using Decomposition and Functional Properties

Variable Reordering for Reversible Wave Cascades - PRELIMINARY VERSION

PAPER Design Methods of Radix Converters Using Arithmetic Decompositions

Preprint from Workshop Notes, International Workshop on Logic Synthesis (IWLS 97), Tahoe City, California, May 19-21, 1997

A New Method to Express Functional Permissibilities for LUT based FPGAs and Its Applications

A Fast Method to Derive Minimum SOPs for Decomposable Functions

x 1 x 2 x 7 y 2 y 1 = = = WGT7 SB(7,4) SB(7,2) SB(7,1)

WITH rapid growth of traditional FPGA industry, heterogeneous

An Exact Optimization Algorithm for Linear Decomposition of Index Generation Functions

A PACKET CLASSIFIER USING LUT CASCADES BASED ON EVMDDS (K)

On the Minimization of SOPs for Bi-Decomposable Functions

Logic Synthesis of EXOR Projected Sum of Products

A Fast Head-Tail Expression Generator for TCAM Application to Packet Classification

An Efficient Heuristic Algorithm for Linear Decomposition of Index Generation Functions

Decomposition of Multi-Output Boolean Functions - PRELIMINARY VERSION

Hardware to Compute Walsh Coefficients

EECS 219C: Computer-Aided Verification Boolean Satisfiability Solving III & Binary Decision Diagrams. Sanjit A. Seshia EECS, UC Berkeley

Design Method for Numerical Function Generators Based on Polynomial Approximation for FPGA Implementation

Index Generation Functions: Minimization Methods

Symmetrical, Dual and Linear Functions and Their Autocorrelation Coefficients

PLA Minimization for Low Power VLSI Designs

On the number of segments needed in a piecewise linear approximation

NUMERICAL FUNCTION GENERATORS USING BILINEAR INTERPOLATION

REMARKS ON THE NUMBER OF LOGIC NETWORKS WITH SAME COMPLEXITY DERIVED FROM SPECTRAL TRANSFORM DECISION DIAGRAMS

A Novel LUT Using Quaternary Logic

Testability of SPP Three-Level Logic Networks

CS470: Computer Architecture. AMD Quad Core

PAPER Minimization of Reversible Wave Cascades

BDD Based Upon Shannon Expansion

Inadmissible Class of Boolean Functions under Stuck-at Faults

Linear Cofactor Relationships in Boolean Functions

Software Engineering 2DA4. Slides 8: Multiplexors and More

13th International Conference on Relational and Algebraic Methods in Computer Science (RAMiCS 13)

Class Website:

326 J. Comput. Sci. & Technol., May 2003, Vol.18, No.3 of single-output FPRM functions. The rest of the paper is organized as follows. In Section 2, p

A New Implicit Graph Based Prime and Essential Prime Computation Technique

Logic BIST. Sungho Kang Yonsei University

Boolean Algebra and Digital Logic 2009, University of Colombo School of Computing

Binary Decision Diagrams and Symbolic Model Checking

On the Number of Products to Represent Interval Functions by SOPs with Four-Valued Variables

A Deep Convolutional Neural Network Based on Nested Residue Number System

USING SAT FOR COMBINATIONAL IMPLEMENTATION CHECKING. Liudmila Cheremisinova, Dmitry Novikov

Multi-Level Logic Optimization. Technology Independent. Thanks to R. Rudell, S. Malik, R. Rutenbar. University of California, Berkeley, CA

Encoding Multi-Valued Functions for Symmetry

An Algorithm for Bi-Decomposition of Logic Functions

Basing Decisions on Sentences in Decision Diagrams

Reversible Function Synthesis with Minimum Garbage Outputs

Hardware Design I Chap. 2 Basis of logical circuit, logical expression, and logical function

MANY PROPERTIES of a Boolean function can be

Anand Raghunathan MSEE 348

PAPER Head-Tail Expressions for Interval Functions

LP Characteristic Vector of Logic Functions Norio Koda Department of Computer Science and Electronic Engineering Tokuyama College of Technology Tokuya

A New Viewpoint on Two-Level Logic Minimization

Simplifying Logic Circuits with Karnaugh Maps

Combinational Equivalence Checking using Boolean Satisfiability and Binary Decision Diagrams

EVMDD-Based Analysis and Diagnosis Methods of Multi-State Systems with Multi-State Components

ELEC Digital Logic Circuits Fall 2014 Sequential Circuits (Chapter 6) Finite State Machines (Ch. 7-10)

For smaller NRE cost For faster time to market For smaller high-volume manufacturing cost For higher performance

Fast FPGA Placement Based on Quantum Model

A Framework for Layout-Level Logic Restructuring. Hosung Leo Kim John Lillis

2009 Spring CS211 Digital Systems & Lab CHAPTER 2: INTRODUCTION TO LOGIC CIRCUITS

Improved ESOP-based Synthesis of Reversible Logic

Derivative Operations for Lattices of Boolean Functions

Combinational Logic Design Combinational Functions and Circuits

EGFC: AN EXACT GLOBAL FAULT COLLAPSING TOOL FOR COMBINATIONAL CIRCUITS

Sequential Equivalence Checking without State Space Traversal

Olivier Coudert Jean Christophe Madre Henri Fraisse. Bull Corporate Research Center. Rue Jean Jaures Les Clayes-sous-bois, FRANCE

Reduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs

Logic Synthesis for Layout Regularity using Decision Diagrams *

Two-Qubit Quantum Gates to Reduce the Quantum Cost of Reversible Circuit

New Hierarchies of Representations RM97, September MAIN CONTRIBUTIONS OF THIS PAPER Generalizations of the Generalized Kronecker representation

Boolean Matching Using Generalized Reed-Muller Forms

Binary Decision Diagrams

Symbolic Model Checking with ROBDDs

Compact Numerical Function Generators Based on Quadratic Approximation: Architecture and Synthesis Method

Reduced Ordered Binary Decision Diagrams

Computational Boolean Algebra. Pingqiang Zhou ShanghaiTech University

Binary Decision Diagrams

A New Design Method for Unidirectional Circuits

Sequential Equivalence Checking - I

Digital System Design Combinational Logic. Assoc. Prof. Pradondet Nilagupta

Spiral 2-4. Function synthesis with: Muxes (Shannon's Theorem) Memories

Review. EECS Components and Design Techniques for Digital Systems. Lec 06 Minimizing Boolean Logic 9/ Review: Canonical Forms

Regularity and Symmetry as a Base for Efficient Realization of Reversible Logic Circuits

2. Accelerated Computations

Unit 3 Session - 9 Data-Processing Circuits

Department of Electrical & Electronics EE-333 DIGITAL SYSTEMS

Counting Two-State Transition-Tour Sequences

Transcription:

LUTMIN: FPGA Logic Synthesis with MUX-Based and Cascade Realizations Tsutomu Sasao and Alan Mishchenko Dept. of Computer Science and Electronics, Kyushu Institute of Technology, Iizuka 80-80, Japan Dept. of EECS, University California Berkeley, CA, USA Abstract First, this paper considers the number of LUTs to implement logic functions based on MUX-based realization and cascade realization. This is useful to quickly estimate the number of LUTs to implement the functions on a FPGA. Second, this paper shows an algorithm to realize logic functions by -LUTs using cascade and MUX-based realizations. It often produces smaller circuits than previous methods when the number of the input variables is smaller than. Introduction A K-LUT (look-up table) is a module that realizes an arbitrary K-variable function. This paper considers the number of K-LUTs to realize given function. FPGAs with K =or input LUTs are believed to be the most efficient [, ]. However, in the current technology, FPGAs with K =are standard [,, ]. Thus, we have the following: Problem. Given an n-variable (multiple-output) function f, find the minimum number of -LUTs needed to implement f, where n 0. Unfortunately, it is a very hard problem, in general. So, we try to obtain an upper bound by using properties or measures of the function. The property of the function should be quickly detected, such as symmetry, and measures should be quickly calculated. Such measures include the number of variables (i.e., the support size), the C-measure of a function, the number of products in the SOP [], the number of literals in the factored expression [], the number of nodes in the decision diagram [8]. In this paper, we present two types of realizations: MUX-based realization and cascade realization. Also we show various theorems on the number of LUTs. After that, we we show an algorithm to realize logic functions by using -input LUTs (-LUTs). The algorithm is suitable for the functions with up to inputs, and often produces better solutions than previous methods. Due to the page limitation, all proofs of theorems are omitted. MUX-based Realization In this part, we show a universal method to implement an n-variable function, and obtain upper bounds on the number of LUTs. Such bounds are useful when we only know the number of the input variables n. Definition. A multiplexer with a single control input (- MUX) is the selection circuit shown in Fig... It performs the logical operation: g(x, y 0,y )= xy 0 xy. t-mux is shown in Fig... It is a multiplexer with t control inputs (x,...,x t ), and t data inputs (y 0,y,...,y t ). Let g(x,...,x t,y 0,y,...,y t ) be the output function. Then, g = y a when the decimal representation of the control input (x,...,x t ) is a. That is, when the control input is (0, 0,...,0), the top data input y 0 is selected. When the control input is (0, 0,...,), the second data input y is selected. Also, when the control input is (,,...,), the last data input y t is selected. Lemma. An n-mux is realized by using n copies of -MUXs. Example. A -MUX is realized with =7copies of -MUXs as shown in Fig... (End of Example) Definition. Let B = {0, }. Let X =(x,x,...,x t ). Then, we can consider that X takes its value from P =

x x x y 0 y g = xy 0 xy Figure.. -MUX. y 0 y x x k y 0 y y y y y y y 7 x y k - Figure.. Realization of -MUX by using - MUXs. Figure.. K-MUX. {0,,..., t }. Let S be a subset (S P ) of P. Then, X S is a literal of X. When X S, X S =, and when X / S, X S = 0. Let S i P i (i = S,,...,n), then X S S X X n n is a logical product. (S X,S,...,S n) S S S X X n n is a sum-of-products S expression (SOP). When S i = P i, X i i =and the logical product is independent of X i. In this case, literal X i i P is redundant and can be deleted. A logical product is also called a term, oraproduct term. Theorem. An arbitrary n-variable function is expanded as follows: f(x,x )= g i (X )X, i i P where X = (x,x,...,x k ) and X = (x k+,x k+,...,x n ), P = {0,,..., n k }, and the OR is performed with respect to n k elements. Example. Consider the realization of a 7-variable function f(x,x ), where X =(x,x,...,x ), and X = (x,x 7 ). f is expanded into a sum of four products: f(x,x )= i=0 g i(x )X i = g 0 (X )X 0 g (X )X g (X )X g (X )X. As shown in Fig.., -MUX can be implemented by using three copies of -MUXs. The top LUT in the leftmost column realizes g 0, which is selected when (x,x 7 )=(0, 0). The second LUT in the leftmost column realizes g,which is selected when (x,x 7 )=(0, ). Other LUTs are derived similarly. (End of Example) Theorem. When K n, an arbitrary n-variable function is realized by using at most n K copies of -MUXs, and n K copies of K-LUTs. X X X X g 0 g g g x -MUX Figure.. Realization of an arbitrary 7- variable function using -LUTs. To implement -MUX, we have more efficient realization than Fig... When K =, a -MUX can be realized by a single -LUT instead of using three copies of -MUXs. Lemma. An arbitrary function of n = K +variables can be implemented with at most three K-LUTs, where K. Lemma. An arbitrary function of n = K +variables can be implemented with at most five K-LUTs, where K. Theorem. [9, ] The number of -LUTs to implement an arbitrary n-variable function f is ( n )/ or less, when n is even, and ( n +)/ or less, when n is odd. x7

-MUX -MUX Figure.. Realization of an arbitrary 8- variable function using -LUTs. -MUX x7 x8 -MUX -MUX x9 -MUX Figure.7. Realization of an arbitrary 0- variable function using -LUTs. X =(x,x ) 00 0 0 00 0 0 0 X =(x,x ) 0 0 0 0 0 0 0 0 0 0 0 Figure.. Realization of an arbitrary 9- variable function using -LUTs. Example. The number of -LUTs to implement an n- variable function is or less, when n = 8. In this case, g i (X )(i = 0,,,, ) are implemented by four copies of -LUTs, while the -MUX is implemented by a single -LUT, as shown in Fig... In the figure, the numbers in squares denote the numbers of necessary LUTs. or less, when n =9. In this case, g i (X )(i = 0,,,...,7) are implemented by 8 copies of -LUTs, while the -MUX is implemented by three -LUTs as shown in Fig... or less, when n =0. In this case, g i (X )(i = 0,,,...,) are implemented by copies of - LUTs, while the -MUX is implemented by using five -LUTs as shown in Fig..7. (End of Example) Cascade Realization In this part, we show a cascade realization of a logic function. This gives more compact circuits than the MUXbased realization, but is only applicable to function with special properties. Figure.. Decomposition table of an logic function.. Functional Decomposition Definition. [] Let f(x) be a logic function, and (X,X ) be a partition of the input variables, where X = (x,x,...,x k ) and X =(x k+,x k+,...,x n ). The decomposition table for f is a two-dimensional matrix with k columns and n k rows, where each column and row is labeled by a unique binary code, and each element corresponds to the truth value of f. The function represented by a column is a column function. Variables in X are bound variables, while variables in X are free variables. Inthe decomposition table, the column multiplicity denoted by µ k is the number of different column patterns. Example. Fig.. shows a decomposition table of a - variable function. Since all the column patterns are different, the column multiplicity is µ =. (End of Example) Theorem. [] For a given function f, letx be the bound variables, let X be the free variables, and let µ k be the column multiplicity of the decomposition table. Then, the function f can be realized with the network shown in Fig... In this case, the number of signal lines connecting two blocks H and G is log µ k. When the number of signal lines connecting two blocks is smaller than the number of input variables in X, we can often reduce the total amount of memory by the realization in

X H X Figure.. Realization of a logic function by decomposition. G f X X Figure.. Realization of a 9-variable function with C-measure at most 8. Fig.. [7]. Such technique is call a Curtis decomposition [], in this paper.. C-Measure Definition. Let f(x,x,...,x n ) be a logic function. Let µ k be the column multiplicity of the decomposition table for f(x,x ), where X = (x,x,...,x k ) and X =(x k+,...,x n ). The C-Measure of the function f is max(µ,µ,...,µ n ), and denoted by µ(f). By repeatedly applying functional decompositions to a given function, we have an LUT cascade [0] shown in Fig... An LUT cascade consists of cells, and the signal lines connecting adjacent cell are rails. A logic function with a small C-measure can be realized by a compact LUT cascade. To obtain the C-measure, a decomposition table is not necessary. A quasi-reduced binary decision diagram (QRBDD) that represents the logic function directly shows the C-measure: The C-measure is equal to the maximum width of the QRBDD, where the variable ordering is (x,x,...,x n ) [7].. LUT Cascade Lemma. [0] Let µ(f) be the C-measure of a function f. Then, f can be implemented by an LUT cascade, whose cells have at most log µ(f) +inputs, and log µ(f) outputs. Lemma. [] In an LUT cascade that realizes the function f, letn be the number of the input variables; s be the number of cells; w = log µ(f) be the maximum number of rails; K be the number of inputs for a cell; n K +; Figure.. LUT cascade. f and K log µ(f) +. Then, an LUT cascade satisfying the following condition exists: n w s. K w. Bounds for Functions with Small C- Measure From Lemmas. and., we can derive the followings: Theorem. [] The number of -LUTs to implement an arbitrary n-variable function, where n 8 is:. n or less, when µ(f).. n or less, when µ(f).. n or less, when µ(f) 8(n =r).. n or less, when µ(f) 8(n r). Example. Let f(x,x ) be a 9-variable function, where X =(x,x,...,x ) and X =(x 7,x 8,x 9 ). Let µ be the column multiplicity of the decomposition of f with respect to (X,X ).Ifµ 8, then f(x,x ) can be implemented with four copies of -LUTs, as shown in Fig... To check that a given 9-variable function can be realized as Fig. (., we need to check the column multiplicities for only 9 ) =combinations.. Bound for Symmetric Functions When the given function is symmetric, it can be implemented more efficiently than a general function [7, ]. Efficient algorithms to detect symmetric functions exist, e.g., []. Lemma. Let f be a symmetric function of n-variables. Then, µ(f) n +. Theorem. The number of K-LUTs to implement an n- variable symmetric function is

Figure.. Realization of a symmetric function of variables by -LUTs. Figure.. Realization of a symmetric function of variables by -LUTs.. or less, when n =9and K =. Fig.. shows the realization.. 7 or less, when n =and K =. Fig.. shows the realization.. or less, when n =and K =. Fig.. shows the realization. Logic Synthesis with -LUTs For some class of functions, the LUT cascade realizations require many fewer LUTs than other methods. Current version of the program works for only small problems. The algorithm obtains the column multiplicity µ and calculate the number of cells in the LUT cascade. When log (µ) K, the algorithm uses a MUX-based realization. The outline of the method is as follows: Algorithm. (LUTMIN:Logic synthesis using -LUTs). For a multi-output function, decompose it into singleoutput functions.. For each single-output function F : (a) Minimize support of F by removing redundant variables. (b) If the support size of F is less than or equal to K (LUT size), implement it using one K-LUT. (c) If the support size of F is exactly K +, the best we can do is to realize with by two K-LUTs. In this case, we check cofactors of F w.r.t. each variable. If the support size of one cofactor is K or less, implement F using two K-LUTs composed of the larger cofactor in one LUT, and smaller cofactor and MUX in another LUT. (d) If F is not decomposed in step (.b) and (.c), reorder BDD of F to minimize the number of nodes. (e) Consider M = log (µ), where µ is column multiplicity of F w.r.t. the bound-set composed of K variables on top of the BDD. (f) If M < K, use Curtis decomposition w.r.t. K topmost variables as the bound set. Implement M bound set nodes using K-LUTs and perform recursive decomposition of the composition function whose support size is equal to: Support size(f ) K + M. (g) If M K, create -MUX w.r.t. to the two topmost variables in the BDD, and recursively decompose four cofactors.. As a last post-processing step, detect and merge functionally equivalent nodes in the resulting LUT network. (Such nodes could be created in step (.g) when, for example, two of the four cofactors of F are equal.) Example. Consider sym [9], a symmetric function of variables. sym is iff the number of s in the inputs is between and 8.. The column multiplicity of the function with respect to the bound set composed of K =variables is µ =7. Thus, M = log 7 =.. Since M < K, we use Curtis decomposition with respect to the topmost variables in the bound set: x,x,...,x. Realize the first cell using -LUTs, which corresponds to the leftmost cell in Fig.... The composition function has -+=9 variables. The top of the variables are three outputs of the leftmost cell, and x 7,x 8,x 9. In this case, the column multiplicity is µ 9 =8. Thus, M = log 8 =.. Since M<K, we use Curtis decomposition with respect to the topmost variables in the bound set: three outputs of the leftmost cell, and x 7,x 8,x 9. Realize the second cell using -LUTs, which corresponds to the middle cell in Fig...

xⅹ x x7 x8 x9 x0 x x sym Figure.. sym implemented by -LUTs. Table.. Number of -LUTs realize functions. MBC LUT MIN SAS Name PI PO LUT lev LUT lev LUT lev alu 8 8 08 8 amd 8 09 9 apex 9 9 87 chkn 9 7 7 7 0 8 cordic 9 cps 09 7 70 9 ex00 0 0 98 9 0 exep 0 0 0 8 in 0 0 0 0 98 intb 7 9 7 7 9 misex 0 misj 7 9 pdc 0 7 0 spla 8 0 ti 8 7 7 9 8 97 8 tial 8 08 89 8. The composition function has 9-+= variables. Since the support size is equal to K=, implement the function by a -LUT, which corresponds to the rightmost cell in Fig.... In this way, sym is realized by ++=7LUTs of -inputs. Note that this realization is different from Fig... Experimental Results (End of Example) Algorithm. was implemented in ABC [] as command lutmin, applicable to logic networks in ABC. The implementation was tested on benchmarks from Espresso/MCNC benchmark suites. The experiments were run on the laptop with Intel Core Duo U900.GHz CPU with Gb RAM (only a few Mb of RAM was used). The total runtime for benchmarks was 8 sec. The resulting LUT networks were verified using SAT-based combinational equivalence checking command cec in ABC. Table. shows the numbers of LUTs and logic levels when K =. The columns headed with MBC show the results of improved version of Mishchenko et al. [0]. The columns headed with LUTMIN show the results obtained by lutmin (Algorithm.). When the number of inputs is less than, Algorithm. often produced better solutions than Mishchenko et al.[0]. The columns headed with SAS shows the upper bounds obtained by using additional idea. (It obtains only the number of LUTs to show the possibility of further improvement. ) In Manohararajah et al. [8] and Mishchenko et al. [0], numbers of -LUTs to implement some benchmark functions are shown. For example, ex00, a 0-input 0-output function. Example. shows that to implement any 0 variable function, LUTs are sufficient. Thus, to implement ex00, which is a 0-output function, requires at most 0 = 0 LUTs. Note that 9 LUTs are used in [8] and 09 LUTs are used in [0]. apex is a 9 input 9-output function. Example. shows that to implement any 9 variable function, LUTs is sufficient. Thus, apex, which is 9-output function, requires at most 9 = 09 LUTs. Note that 78 LUTs are used in [8] and 7 LUTs are used in [0]. These results show that the even the state-of-the-art logic synthesis algorithms still have room for improvements, which is also noted by Cong and Minkovich []. The reason why [0] failed to obtain the optimum solutions are as follows:. [0] uses AND inverter graphs as an initial solutions, which can lead to local minimal solutions.. [0] uses heuristic method to design larger circuits, and some optimization is omitted to save computation time. It should be noted that [0] is applicable to large circuits, while Algorithm. is applicable to only small problems.. Conclusion and Comments In this paper, we consider the number of -LUTs to implement logic functions. Also presented a synthesis method with -LUTs. The method is MUX-based realizations and cascade-based realizations. Although, this method produces circuits with more levels and is practical for the functions with at most inputs, it often produces circuits with fewer LUTs than existing methods [0, 8]. A possible extension is to optimize encodings of the rail outputs [9]. By considering the encodings, we can often replace one or more outputs functions of LUTs by primary input variables. In such a case, the number of LUTs can be further reduced. Another extension is to extract decomposable component whose number of inputs is at most.

In this paper, to implement the multiple-output functions, functions are decomposed into single-output ones. However, decomposition into groups of a few functions also produces very good circuits []. Yet another possible extension is to use such partitions of the outputs. 7. Acknowledgments This research is supported in part by the Grants in Aid for Scientific Research of JSPS, and the Grant of Knowledge Cluster Project of MEXT. Discussion with Prof. R. K. Brayton was quite useful. References [] Altera, Stratix IV FPGA Core Fabric Architecture, http://www.altera.com/ [] E. Ahmed and J. Rose, The effect of LUT and cluster size on deep-submicron FPGA performance and density, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol., No., March 00, pp.88-98. [] R. L. Ashenhurst, The decomposition of switching functions, International Symposium on the Theory of Switching,pp. 7-, April 97. [] Berkeley Logic Synthesis and Verification Group, ABC: A System for Sequential Synthesis and Verification, Release 709. http://www.eecs.berkeley.edu/ alanmi/abc/ [] J. Cong and K. Minkovich, Optimality study of logic synthesis for LUT-Based FPGAs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol., Issue, Feb. 007, pp.0-9. [] H.A.Curtis,ANew Approach to the Design of Switching Circuits, D. Van Nostrand Co., Princeton, NJ, 9. [7] V. N. Kravets, K. A. Sakallah, Constructive libraryaware synthesis using symmetries, DATE 000, pp.08-, March 000. [8] V. Manohararajah, S. D. Brown, and Z. G. Vranesic, Heuristics for area minimization in LUT-based FPGA technology mapping, IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, Vol., No., November 00, pp. -0. [9] A. Mishchenko and T. Sasao, Encoding of Boolean functions and its application to LUT cascade synthesis, International Workshop on Logic and Synthesis (IWLS00), New Orleans, Louisiana, June -7, 00, pp.-0. [0] A. Mishchenko, R. K. Brayton, and S. Chatterjee, Boolean factoring and decomposition of logic networks, Proc. ICCAD 08, Nov. 008, pp.8-. [] R. Murgai, R. Brayton, and A. Sangiovanni Vincentelli, Logic Synthesis for Field-Programmable Gate Arrays, Springer, July 99. [] H. Nakahara and T. Sasao, A PC-based logic simulator using a look-up table cascade emulator, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E89-A, No., Dec. 00, pp. 7-8. [] S. Panda, F. Somenzi, and B. F. Plessier, Symmetry detection and dynamic variable ordering of decision diagrams, ICCAD99, Nov. 99, pp.8-. [] C. A. Papachristou, Characteristic measures of switching functions, Information Science, Vol.,No., pp.- 7, 977. [] J. Rose, R. J. Francis, D. Lewis, and P. Chow, Architecture of field programmable gate arrays: The effect of logic block functionality on area efficiency, IEEE J. Solid State Circ.,, pp. 7-, Oct. 990. [] J. Rose, A. El Gamal, and A. Sangiovanni-Vincentelli, Architecture of field-programmable gate arrays, Proc. IEEE,Vol. 8, No. 7, pp.0-09, July 99. [7] T. Sasao, FPGA design by generalized functional decomposition, In Logic Synthesis and Optimization, Sasao ed., Kluwer Academic Publisher, pp. -8, 99. [8] T. Sasao and J. T. Butler, A Design method for look-up table type FPGA by pseudo-kronecker expansion, IEEE International Symposium on Multiple-Valued Logic, Boston, May 99, pp. 97-0. [9] T. Sasao, Switching Theory for Logic Synthesis, Kluwer Academic Publishers, 999. [0] T. Sasao, M. Matsuura, and Y. Iguchi, A cascade realization of multiple-output function for reconfigurable hardware, International Workshop on Logic and Synthesis (IWLS0), Lake Tahoe, CA, June -, 00, pp. - 0. [] T. Sasao, Analysis and synthesis of weighted-sum functions, IEEE TCAD, Special issue on International Workshop on Logic and Synthesis, Vol., No., May 00, pp. 789-79. [] T. Sasao, A New expansion of symmetric functions and their application to non-disjoint functional decompositions for LUT-type FPGAs, International Workshop on Logic Synthesis, Dana Point, California, U.S.A., May -June, 000, pp.0-0. [] T. Sasao, On the number of LUTs to realize sparse logic functions, International Workshop on Logic and Synthesis IWLS-009, July -Aug., 009, Berkeley, U.S.A.. [] Xilinx, Advantages of the Virtex- FPGA, -Input LUT Architecture, WP8 (v.0), Dec. 9, 007, http://www.xilinx.com/