In FPGA system design, maximizing performance requires a balanced mix of capable, efficient building blocks on the fabric, including logic, on-chip memory, DSP, and I/O bandwidth. In this article, I explain how you can benefit from the building blocks of the Xilinx® Virtex™-5 FPGA, and especially from the new ExpressFabric™ technology, when pursuing higher system-level performance. Using quantified expected performance improvements for logic and arithmetic functions as examples, I examine the main features of the ExpressFabric architecture. Benchmarks based on actual customer designs show that Virtex-5 ExpressFabric technology performs on average 30% faster than the previous-generation Virtex-4 FPGA.
With its new logic fabric (in which you can implement functions such as counters, accumulators, and RAM/ROM memories) and its available hard IP blocks for memory and DSP (optimized to run at clock rates of up to 550 MHz), the Virtex-5 FPGA is without doubt the platform of choice for high-performance designs.
ExpressFabric performance
Since the first FPGA was introduced in the mid-1980s, most FPGA logic fabrics have been based on the same basic four-input look-up table (LUT) architecture. The Virtex-5 family is the first FPGA platform to provide a true six-input LUT (6-LUT) fabric with fully independent (not shared) inputs (Figure 1). For the 65 nm Virtex-5 FPGA family, the move to a 6-LUT architecture offers the most effective trade-off between critical path delay, which is the factor that determines logic fabric performance, and die area.

Figure 1: A Virtex-5 configurable logic block (CLB) consists of two slices; each slice uses four independent 6-LUTs, which offer the advantage of fewer logic levels.
As process technology advances, interconnect delay can account for more than 50% of critical path delay. Xilinx has developed a new interconnect pattern for the Virtex-5 FPGA that improves performance by reaching more destinations in fewer hops. The new pattern increases the number of logic connections that can be reached within two to three hops. In addition, the more regular routing pattern makes it easier for the Xilinx ISE™ software to find optimal routes. All of these interconnect features are transparent to the FPGA designer, but they translate into higher overall performance and designs that are easier to route. In essence, the Virtex-5 pattern provides fast, predictable routing based on distance.
Combining the new 6-LUT architecture with special-purpose features such as the carry chain, dedicated multiplexers, and flip-flops (and the unique way these elements are connected) yields remarkable performance and efficiency when implementing logic and arithmetic functions.
The multiplexer (MUX) is one clear example of the advantages of ExpressFabric technology. Implementing a 4:1 MUX in the Virtex-4 architecture requires two 4-input LUTs and one MUXF block; in a Virtex-5 device, the same 4:1 MUX can now be implemented with a single LUT. Similarly, an 8:1 MUX requires four LUTs and three MUXF blocks in the Virtex-4 architecture, whereas the new Virtex-5 architecture needs only two 6-LUTs, so performance is higher and logic utilization is better, as shown in Figure 2.

Figure 2: Comparison of 8:1 multiplexer implementations in Virtex-5 and Virtex-4 FPGAs.
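The multiplexer comparison above follows from simple input counting: a 4:1 MUX has four data inputs and two select inputs, which is exactly six. The Python sketch below models a 6-LUT as a 64-entry truth table and checks that one table reproduces a 4:1 MUX, and that two such tables plus a final 2:1 selection reproduce an 8:1 MUX. The pin assignment (data on inputs 0 to 3, select on inputs 4 and 5), the helper names, and the way the final 2:1 stage is modeled are illustrative assumptions, not the actual Virtex-5 primitive interface.

```python
from itertools import product

def make_lut6(func):
    """Build the 64-entry truth table (packed into an int) of a 6-input function."""
    init = 0
    for index in range(64):
        bits = [(index >> i) & 1 for i in range(6)]  # bits[i] is the value on input i
        if func(*bits):
            init |= 1 << index
    return init

def eval_lut6(init, inputs):
    """Look up the LUT output for a list of six input bits."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return (init >> index) & 1

def mux4(d0, d1, d2, d3, s0, s1):
    """Reference 4:1 multiplexer: two select bits choose one of four data bits."""
    return (d0, d1, d2, d3)[s0 | (s1 << 1)]

# One 6-LUT configured as a 4:1 MUX (assumed pin order: d0..d3, s0, s1).
MUX4_INIT = make_lut6(mux4)

def mux8(data, sel):
    """8:1 MUX from two 6-LUTs plus a final 2:1 selection on sel[2] (assumed)."""
    low = eval_lut6(MUX4_INIT, data[0:4] + sel[0:2])
    high = eval_lut6(MUX4_INIT, data[4:8] + sel[0:2])
    return high if sel[2] else low

# Exhaustive check of the single-LUT 4:1 MUX against the reference function.
for bits in product((0, 1), repeat=6):
    assert eval_lut6(MUX4_INIT, list(bits)) == mux4(*bits)

# Exhaustive check of the 8:1 MUX: 8 data bits and 3 select bits.
for bits in product((0, 1), repeat=11):
    data, sel = list(bits[:8]), list(bits[8:])
    assert mux8(data, sel) == data[sel[0] | (sel[1] << 1) | (sel[2] << 2)]
```

An 8:1 MUX has eleven inputs in total, so it cannot fit in a single 6-LUT; that is why the sketch needs two tables and one extra selection step.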
As in previous Xilinx FPGA families, the Virtex-5 Slice L (logic slice) can implement logic functions, registers, and arithmetic functions using the dedicated carry chain. The slightly more complex Slice M (memory slice) adds the ability to implement distributed RAM and shift registers (SRL) inside the LUTs.
Among the various improvements provided by the ExpressFabric architecture, the new carry chain structure delivers substantially higher performance when implementing arithmetic operations. Its effect on critical path delay is easy to see in the examples in Table 1.

Table 1: Comparison of arithmetic function implementations in Virtex-5 and Virtex-4 FPGAs.
Distributed memory functions such as LUT RAM or ROM also benefit in several ways from the larger LUT structure. The new width-to-depth ratio allows small memory functions to be packed more densely, which brings a significant performance advantage, as described in Table 2.

Table 2: Comparison of LUT-based RAM/ROM implementations in Virtex-5 and Virtex-4 FPGAs.
The performance gains from the improved 6-LUT logic fabric and interconnect structure are significant, but they are only the beginning.
Most applications need more on-chip RAM than LUT-based RAM can provide. With the enhanced block RAM in Virtex-5, you can achieve higher on-chip memory performance.
Block RAM performance
With the move to 65 nm, the Virtex-5 block RAM clock speed has increased by 10%, reaching 550 MHz. However, to deliver the performance that most current applications demand, block RAM needs to be not only faster but also larger.
The Virtex-5 block RAM size has doubled to 36 Kb. This larger block (made up of two 18 Kb memories) supports 72-bit data words in simple dual-port mode, which doubles the block RAM bandwidth. In addition, Virtex-5 FPGAs provide dedicated connections that let you cascade two adjacent 36 Kb block RAMs in a block RAM column, implementing a 72 Kb memory that runs at the maximum speed of 550 MHz.
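The block RAM figures above can be turned into a quick back-of-the-envelope calculation. The sketch below assumes that 1 Kb means 1024 bits and that one word is transferred per clock cycle on a port; the function names are illustrative only.

```python
BLOCK_RAM_BITS = 36 * 1024     # one 36 Kb block RAM, assuming 1 Kb = 1024 bits
WORD_WIDTH_BITS = 72           # widest word in simple dual-port mode
CLOCK_HZ = 550_000_000         # maximum block RAM clock rate quoted in the text

def depth_in_words(total_bits, width_bits):
    """Number of words a memory of total_bits holds at the given word width."""
    return total_bits // width_bits

def peak_bandwidth_bps(width_bits, clock_hz):
    """Peak bits per second through one port, one word per clock cycle."""
    return width_bits * clock_hz

depth = depth_in_words(BLOCK_RAM_BITS, WORD_WIDTH_BITS)
bandwidth = peak_bandwidth_bps(WORD_WIDTH_BITS, CLOCK_HZ)

print(f"{depth} words of {WORD_WIDTH_BITS} bits")        # 512 words of 72 bits
print(f"{bandwidth / 1e9:.1f} Gbps peak per port")       # 39.6 Gbps peak per port
```

Halving the word width to 36 bits at the same clock halves the peak figure, which is the sense in which the 72-bit mode doubles bandwidth.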
The ever-growing capability of FPGAs has accelerated the trend toward integrating more subsystems into a single device, which makes interfacing across multiple clock domains increasingly common. Virtex-5 devices accommodate this trend by providing integrated logic that simplifies the implementation of flexible and efficient FIFOs.
Together, these enhancements mean that Virtex-5 block RAM provides more on-chip memory, makes FIFOs easier to build, and delivers higher bandwidth.
DSP performance
FPGAs are increasingly recognized as a viable solution for high-performance DSP applications. Whether used as a coprocessor or as a single-chip solution for more demanding applications, FPGAs continue to offer the best combination of performance, power, and cost.
To meet the seemingly insatiable demand for higher DSP performance, Xilinx has given Virtex-5 DSP a leading position in both clock rate and precision: the clock rate has increased to 550 MHz, and the precision has increased from 18 x 18 to 25 x 18.
Xilinx has also optimized the Virtex-5 DSP48 slice for chained implementations, and its performance makes it possible to build very efficient high-performance filters. Dedicated routing resources that link the inputs and outputs of each DSP48 slice allow any number of slices in a column to be chained. This dedicated routing ensures that every DSP48 slice in the chain runs at full speed without consuming any fabric routing or logic resources that the rest of the FPGA design may need. Taken together, these improvements halve the resources needed to implement common high-precision functions. For example, a 35 x 25 multiplication requires four DSP48 slices in a Virtex-4 FPGA; with the wider DSP block available in the Virtex-5 FPGA, the same multiplication needs only half as many DSP48 slices.
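The 35 x 25 example can be checked arithmetically. The sketch below shows one way a signed 35 x 25 product can be assembled from two 25 x 18 signed multiplications by splitting the 35-bit operand into a 17-bit unsigned low part and an 18-bit signed high part. This particular split and all function names are assumptions made for illustration; the sketch models only the arithmetic, not the actual DSP48 cascade wiring.

```python
import random

def fits_signed(value, bits):
    """True if value is representable as a two's-complement number of that width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def mult_25x18(a, b):
    """Model of a single 25 x 18 signed multiplier; rejects operands that do not fit."""
    assert fits_signed(a, 25) and fits_signed(b, 18)
    return a * b

def mult_35x25(b35, a25):
    """Signed 35 x 25 product built from two 25 x 18 multiplications."""
    assert fits_signed(b35, 35) and fits_signed(a25, 25)
    b_low = b35 & ((1 << 17) - 1)   # low 17 bits, always non-negative
    b_high = b35 >> 17              # remaining 18 bits, carries the sign
    low_product = mult_25x18(a25, b_low)
    high_product = mult_25x18(a25, b_high)
    return (high_product << 17) + low_product  # realign the high partial product

# Spot check against ordinary integer multiplication on random operands.
rng = random.Random(0)
for _ in range(1000):
    b = rng.randrange(-(1 << 34), 1 << 34)
    a = rng.randrange(-(1 << 24), 1 << 24)
    assert mult_35x25(b, a) == a * b
```

With 18 x 18 multipliers, the 25-bit operand would also have to be split, giving two parts times two parts, which is consistent with the four slices quoted for Virtex-4.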
I/O bandwidth performance
As performance benchmarks advance, the rate at which an FPGA can process data becomes tied to the device's available I/O bandwidth, which determines how fast large amounts of data can be moved onto and off the device. When an external memory buffer is used, the interface must run at least twice as fast as the data processing rate, because the data must be both written to memory and read back into the FPGA.
By both raising the data rate per pin and using larger packages to increase the number of available I/Os, the Virtex-5 FPGA improves on Virtex-4 bandwidth. For example, for popular memory interfaces such as DDR2 SDRAM, the bandwidth per pin has increased from 534 Mbps to 667 Mbps; the number of data I/Os, when SSO requirements are taken into account, has increased from 32 to 576.
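The factor-of-two rule for external buffers and the per-pin rates above combine into a simple sizing estimate. In the sketch below, the 10 Gbps stream is a hypothetical workload chosen only for illustration, and the function names are mine; the estimate also ignores protocol overhead such as refresh and bus turnaround.

```python
def required_memory_bandwidth_mbps(processing_rate_mbps):
    """Each data item is written to the buffer once and read back once."""
    return 2 * processing_rate_mbps

def data_pins_needed(processing_rate_mbps, per_pin_mbps):
    """Minimum number of data pins, rounded up to a whole pin."""
    required = required_memory_bandwidth_mbps(processing_rate_mbps)
    return -(-required // per_pin_mbps)  # ceiling division on integers

STREAM_MBPS = 10_000  # hypothetical 10 Gbps processing rate

pins_at_534 = data_pins_needed(STREAM_MBPS, 534)
pins_at_667 = data_pins_needed(STREAM_MBPS, 667)
print(pins_at_534, pins_at_667)  # 38 30
```

Under these assumptions, the higher per-pin rate serves the same stream with fewer data pins, or serves a faster stream with the same pin budget.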
Customer design benchmarks
To further evaluate the performance improvement provided by the Virtex-5 FPGA logic fabric, we implemented a set of customer designs using the Xilinx ISE software.
These designs are written entirely in VHDL or Verilog. Some special design elements, such as memories and FIFOs, were implemented with library modules or by direct instantiation during synthesis, but many were implemented with EDIF modules generated by the CORE Generator™ software.
For these benchmarks, we ran synthesis with the Synplify Pro tool from Synplicity in timing-driven mode and measured performance using tight, realistic constraints. This ensured that all special optimizations and logic replication were applied.
Implementation in the ISE software was run with the place and route effort level set to "high." The clock constraint was raised repeatedly in 5% increments until the design could no longer meet its constraints.
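The constraint sweep described above can be sketched as a simple loop. Here `meets_timing` is a hypothetical callback standing in for a complete synthesis and place-and-route run at a given clock constraint; it is not an ISE API. The sketch also assumes that each 5% step is applied to the last passing constraint (compounding), which the text does not specify.

```python
def find_max_clock_mhz(start_mhz, meets_timing, step=0.05, max_iterations=200):
    """Raise the clock constraint by `step` per round until timing fails.

    Returns the last constraint that passed, or None if even the starting
    constraint cannot be met.
    """
    if not meets_timing(start_mhz):
        return None
    best = start_mhz
    for _ in range(max_iterations):
        candidate = best * (1 + step)   # next constraint, 5% tighter by default
        if not meets_timing(candidate):
            break                       # first failure ends the sweep
        best = candidate
    return best

# Toy stand-in for a real implementation run: a design that closes
# timing at any constraint up to 250 MHz.
def toy_design(mhz):
    return mhz <= 250.0

fmax = find_max_clock_mhz(100.0, toy_design)
print(f"highest passing constraint: {fmax:.1f} MHz")
```

Because the sweep moves in 5% steps, the reported figure can understate the true maximum by up to one step, which is worth remembering when comparing results between designs.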
The result: compared with the same designs implemented in Virtex-4 FPGAs, average performance improved by 30%, as shown in Figure 3.

Figure 3: Comparison across a set of 74 customer designs implemented using the ISE 8.2i software.
Most of the designs that improved the most have large logic cones; their critical paths often implement large, complex logic equations. For example, the bulk of the logic in the critical path of an ASIC prototyping project usually contains very few registers. Designs of this type show the greatest improvement from Virtex-5 ExpressFabric technology.
Designs that show only moderate improvement either have few logic levels, use hard IP blocks, or offer few opportunities for the improved carry chain structure to help.
Figure 4 summarizes, by category, the performance improvement of Virtex-5 FPGAs over the previous-generation Virtex-4 FPGAs.
Figure 4: Virtex-5 FPGA performance improvement.
Conclusion
With its new ExpressFabric technology, tightly coupled with other high-performance hard IP and I/O, the Virtex-5 FPGA family delivers a significant performance gain over the previous-generation architecture.