• Using a new SRAM technology to implement embedded ASIC and SoC memory designs

    Static RAM blocks built from the traditional six-transistor (6T) memory cell have long been the tool of choice for developers implementing ASICs and SoCs in many embedded designs, because this memory structure suits mainstream CMOS processes well and needs no extra process steps.

    As shown in Figure 1a, the 6T memory cell consists of a cross-coupled latch and active load elements. This cell can be used in memory arrays ranging from a few bits to several megabits.

    A carefully designed memory array of this kind can meet many different performance requirements; the specifics depend on whether the designer chooses a CMOS process optimized for high performance or for low power. SRAM blocks made in a high-performance process can easily reach access times below 5 ns at the 130 nm node, whereas blocks made in a low-power process generally have access times above 10 ns.

    Because the memory cell is static, very little support circuitry is needed: address decoding and enable signals are enough to design the decoders, sense circuits and timing circuits.

    As process nodes advance and feature sizes shrink, static RAM built from traditional six-transistor cells offers ever shorter access times and ever smaller cell sizes, but leakage current and sensitivity to soft errors are both rising. Designers must add extra circuitry to reduce leakage, and provide error detection and correction mechanisms to "scrub" soft errors from the memory.

    Limitations of the current 6T SoC RAM cell

    However, the six transistors needed to form the latch and the high-performance loads make the 6T cell very large, which greatly limits the storage capacity that can be realized in a memory array.

    The main factors behind this limit are the area consumed by the memory block and the cell leakage that comes with the process nodes used to implement the chip (130, 90, 65 nm). As the memory array takes up a growing share of the total chip area, die size and cost grow with it.

    Leakage current may also exceed the overall power budget or restrict the use of 6T cells in portable devices. A chip with a larger area or high leakage may ultimately fail to meet the target price of the application, and so cannot be an economical solution.

    The 1T cell as an alternative to the 6T RAM cell

    For applications that need large on-chip storage (usually more than 256 kb) but do not require the absolute minimum access time, there is another solution. It uses a memory array that functions like SRAM but is based on a one-transistor/one-capacitor (1T) storage cell similar to that used in dynamic RAM (Figure 1b).

    Figure 1a: Typical six-transistor static RAM memory cell. Figure 1b: Typical one-transistor/one-capacitor dynamic memory cell.

    Such a memory array can reach 2 to 3 times the density of a 6T array in the same chip area. When embedded memory requirements exceed several megabits, a simple dynamic RAM array can be used, but such an array requires the system controller and logic to understand the dynamic nature of the memory and to supply the correct refresh control and timing signals.

    Another way to embed a simple DRAM block is to bundle the DRAM array with its own controller so that it looks like an easy-to-use SRAM array. By integrating high-density 1T cells with support logic that supplies the refresh signals, the dynamic nature of the cells is hidden from the ASIC/SoC designer, who can treat the blocks as static RAM when implementing ASIC and SoC solutions (Figure 2).

    Figure 2: Adding control and interface support logic around a DRAM array lets the array be used like static RAM, which increases memory density.

    Some companies and foundries have developed 1T cells that need extra mask layers on top of the standard CMOS layers. This approach raises wafer cost and is tied closely to a specific foundry, so manufacturing is confined to that foundry. To offset the extra wafer processing cost, the total DRAM array area used on the chip generally has to exceed 50% of the die area. Moreover, most DRAM macros are hard macros whose size, aspect ratio and interface are constrained.

    SoC designs need cost-effective IP macros that, depending on cost or capacity needs, can easily be manufactured at any foundry or migrated from one foundry to another. Such macros also give the ASIC designer more flexibility during the layout and configuration stages.

    Many foundries offer this so-called "one-transistor SRAM" technology as licensable intellectual property. This compiler-driven approach is used in bulk CMOS processes; because it adds no extra mask steps, it can cut wafer cost by 15-20% and shorten time to market.

    To the rest of the system, the memory block interface produced by this approach looks like static RAM, but compared with an array of 6T cells its density (bits per unit area) can be 2 to 3 times higher (after averaging in the overhead of the support circuitry in the area calculation). The larger the memory array, the smaller the share of total area taken by the support circuitry, and the higher the area efficiency of the block.
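    The amortization of support circuitry over a larger array can be illustrated with a rough area model. This is only a sketch: the cell area, the fixed overhead and the per-bit overhead are made-up assumptions chosen for illustration, not figures from any foundry or memory compiler, and the function names are hypothetical.

```python
# Illustrative model only: every numeric value below is an assumption,
# not data from a foundry, a compiler or the article.

def block_area_um2(bits, cell_um2, fixed_um2, per_bit_um2):
    """Total block area = cell array + support circuitry (fixed + per-bit part)."""
    array = bits * cell_um2
    support = fixed_um2 + bits * per_bit_um2
    return array + support


def area_efficiency(bits, cell_um2, fixed_um2, per_bit_um2):
    """Fraction of the block area that is occupied by storage cells."""
    total = block_area_um2(bits, cell_um2, fixed_um2, per_bit_um2)
    return bits * cell_um2 / total


CELL_1T = 0.8        # um^2 per bit (assumed)
FIXED = 150_000.0    # um^2 of refresh/control logic, independent of size (assumed)
PER_BIT = 0.1        # um^2 of decoders and sense amplifiers per bit (assumed)

for kbits in (256, 1024, 4096):
    bits = kbits * 1024
    eff = area_efficiency(bits, CELL_1T, FIXED, PER_BIT)
    print(f"{kbits:5d} kb: area efficiency = {eff:.2f}")
```

    With any positive fixed overhead, the efficiency rises as the array grows, which is the trend described above.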

    To create the ideal memory array, a memory compiler tool such as MemQuest can be used. These tools let the designer implement lower-power ("cooler"), faster or higher-density coolSRAM-1T configurations that can be ported across foundries and process nodes (see Figure 3), avoiding the non-recurring engineering costs of a hand-built array.

    Figure 3: The portable coolSRAM-1T design targets ultra-low-power devices; it reduces leakage current through adaptive circuit sizing, virtual ground, adaptive back bias and other circuit techniques.

    The compiler also helps the user reach the shortest time to market with the optimal core size, interface and aspect ratio, and provides the designer with electrical, physical, simulation (Verilog and VHDL), test and synthesis views of the memory array.

    In a 1 Mb memory array example, such as a coolSRAM-1T configuration, the leakage current at room temperature is a few microamps, under typical conditions for supply voltage and clock rate (Figure 3).

    With a typical refresh rate of 100 kHz or less and a 128k word × 8 bit organization, a 1 Mb coolSRAM-1T array has a standby power for data retention comparable to that of an SRAM of similar capacity. (A coolSRAM-6T 1 Mb instance manufactured in TSMC's 130 nm G process occupies about 2.6 mm² and consumes less than 100 µW per MHz.)
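    The organization and power figures quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 50 MHz operating frequency is an assumption, and the power estimate simply scales the quoted "less than 100 µW per MHz" bound linearly with frequency.

```python
def capacity_bits(words, bits_per_word):
    """Total capacity of a words x bits organization."""
    return words * bits_per_word


def address_bits(words):
    """Number of address lines needed to select one word."""
    return (words - 1).bit_length()


def power_upper_bound_uw(freq_mhz, uw_per_mhz=100.0):
    """Upper bound on power, assuming linear scaling with clock frequency."""
    return freq_mhz * uw_per_mhz


WORDS = 128 * 1024   # 128k words, from the article
WIDTH = 8            # bits per word, from the article
FREQ_MHZ = 50        # assumed operating frequency, for illustration only

print("capacity (Mb): ", capacity_bits(WORDS, WIDTH) / (1024 * 1024))
print("address bits:  ", address_bits(WORDS))
print("power bound mW:", power_upper_bound_uw(FREQ_MHZ) / 1000.0)
```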

    Although SRAM-1T functions like SRAM, internally it has DRAM characteristics: when implemented in a 130 nm process, a memory cell can hold its data for tens of milliseconds at room temperature. The supporting refresh control logic performs refresh transparently and can adjust the refresh period according to temperature.
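    The relationship between retention time and refresh rate can be sketched as follows. The row count and the exact retention time are assumptions (the article only says "tens of milliseconds"), and treating the 100 kHz figure mentioned earlier as a per-row refresh rate is also an assumption made for this illustration.

```python
# Sketch of the refresh arithmetic. ROWS and RETENTION_S are assumptions.

def min_row_refresh_rate_hz(rows, retention_s):
    """Every row must be refreshed at least once per retention period."""
    return rows / retention_s


def full_sweep_time_s(rows, row_refresh_rate_hz):
    """Time needed to refresh all rows once at a given row-refresh rate."""
    return rows / row_refresh_rate_hz


ROWS = 1024            # assumed number of rows in the array
RETENTION_S = 0.020    # assumed 20 ms retention at room temperature
REFRESH_HZ = 100_000   # 100 kHz, read here as one row refresh per period

needed_hz = min_row_refresh_rate_hz(ROWS, RETENTION_S)
sweep_s = full_sweep_time_s(ROWS, REFRESH_HZ)

print(f"minimum row-refresh rate: {needed_hz / 1e3:.1f} kHz")
print(f"full sweep at 100 kHz:    {sweep_s * 1e3:.2f} ms")
print("retention requirement met:", sweep_s <= RETENTION_S)
```

    A temperature-aware controller would shorten the sweep time as retention drops at higher temperature; that dependence is not modelled here.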

    If the designer wants the SoC to manage refresh, the refresh controller in the memory array can be bypassed and refresh signals from the SoC logic used instead. This can save some dynamic power in the SoC, because the system logic can perform refresh "on demand" rather than relying on the "automatic" refresh logic embedded in the SRAM-1T.

    The memory cells in an SRAM-1T instance also support sleep and standby modes. In sleep mode, power can be cut dramatically by gating the clocks to most of the memory array.

    When the array is "woken up" from sleep, the data must be reloaded into the memory cells. In standby mode, the memory retains its data through low-frequency refresh operations, and power consumption is very low. On return to active mode the memory is operational immediately, and the data does not need to be reloaded into the array.
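    The difference between the two low-power modes can be captured in a small behavioural model. This is a toy sketch for illustration: the class, its methods and the mode names are hypothetical and do not correspond to any vendor interface.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    STANDBY = "standby"   # low-frequency refresh keeps the data
    SLEEP = "sleep"       # clocks gated; contents are lost


class Sram1TModel:
    """Toy behavioural model of the power modes (hypothetical names)."""

    def __init__(self, words):
        self.mode = Mode.ACTIVE
        self.data = [0] * words
        self.valid = True   # becomes False once contents are lost in sleep

    def set_mode(self, mode):
        if mode is Mode.SLEEP:
            self.valid = False   # data must be reloaded after wake-up
        self.mode = mode

    def reload(self, values):
        """Restore the full contents after waking from sleep."""
        self.data = list(values)
        self.valid = True

    def write(self, addr, value):
        if self.mode is not Mode.ACTIVE:
            raise RuntimeError("array is not in active mode")
        self.data[addr] = value

    def read(self, addr):
        if self.mode is not Mode.ACTIVE:
            raise RuntimeError("array is not in active mode")
        if not self.valid:
            raise RuntimeError("contents lost in sleep; reload first")
        return self.data[addr]


mem = Sram1TModel(words=4)
mem.write(0, 0xAB)
mem.set_mode(Mode.STANDBY)
mem.set_mode(Mode.ACTIVE)
print(hex(mem.read(0)))   # data survives standby
```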

    Through configuration, the designer can also have the memory array refreshed in different block sizes (256, 512, 1024 or 2048), or even refresh several rows at the same time. The designer may also choose to refresh only a small part of the array, so that essential data is retained, while shutting off power to the rest of the array.

    For any memory array, variation in the manufacturing process can always cause one or two bad bits in the array. Such chips do not have to be discarded; the designer only needs to add row and column redundancy to improve yield.

    If bits fail after the chip has been delivered, the built-in self-repair function and the one-time programmable coolOTP memory can be used to repair the memory array. In addition, a built-in self-test function can be added to the memory IP block without affecting chip performance.
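    Row redundancy can be pictured as a small remapping table that steers accesses away from faulty rows. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, only row (not column) redundancy is modelled, and the dictionary stands in for whatever non-volatile storage holds the repair information.

```python
class RepairableArray:
    """Toy row-redundancy model; names and sizes are hypothetical."""

    def __init__(self, rows, spare_rows):
        self.rows = rows
        # Spare rows are placed after the regular rows in this model.
        self.free_spares = list(range(rows, rows + spare_rows))
        self.remap = {}   # faulty logical row -> spare physical row

    def repair(self, bad_row):
        """Assign a spare row to a faulty row; return False if none is left."""
        if bad_row in self.remap:
            return True
        if not self.free_spares:
            return False
        self.remap[bad_row] = self.free_spares.pop(0)
        return True

    def physical_row(self, logical_row):
        """Row actually accessed for a given logical row address."""
        return self.remap.get(logical_row, logical_row)


array = RepairableArray(rows=1024, spare_rows=2)
array.repair(17)
print(array.physical_row(17), array.physical_row(18))
```

    In this model a die with more faulty rows than spares cannot be repaired, which is why the amount of redundancy is a yield-versus-area trade-off.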

    When the key characteristics of the memory array cannot meet system needs, the designer can use architectural techniques to get higher performance out of the array. These techniques come at a price, however: they affect chip power, size and complexity. Careful trade-offs are therefore needed to find the combination of memory array and chip architecture that best meets the performance and cost goals.

    A wide-word architecture is a good option for the chip architect: the memory is organized to deliver 128-, 256- or 1024-bit-wide data words internally, which are then multiplexed down to the desired word width (see Figure 4).

    Figure 4: In a typical SoC design, a wide internal memory bus can be used to move graphics data and real-time DSP data quickly.

    This technique can raise the apparent clock rate by 2 or 4 times, which reduces the effective access time and ultimately the power consumption. The drawback is an area penalty for the IP design, because multiplexing logic is needed to narrow the wide word down to a width suitable for the rest of the SoC.
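    The wide-word idea can be sketched in a few lines: one internal access fetches a wide word, and multiplexing logic hands it out in narrower slices. The function names are hypothetical, and the least-significant-slice-first ordering is an assumption made for this example.

```python
def split_wide_word(wide_word, wide_bits, narrow_bits):
    """Multiplex a wide internal word down to narrow words, LSB slice first."""
    if wide_bits % narrow_bits != 0:
        raise ValueError("wide width must be a multiple of the narrow width")
    mask = (1 << narrow_bits) - 1
    return [(wide_word >> (i * narrow_bits)) & mask
            for i in range(wide_bits // narrow_bits)]


def words_per_access(wide_bits, narrow_bits):
    """Narrow words delivered per internal array access."""
    return wide_bits // narrow_bits


# One 128-bit internal access serves four 32-bit words to the rest of the SoC.
internal = 0x00112233_44556677_8899AABB_CCDDEEFF
for word in split_wide_word(internal, 128, 32):
    print(f"{word:08X}")
print("words per access:", words_per_access(128, 32))
```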

    Another approach is to split the memory into multiple instances (banks) and build a memory controller that accesses these instances alternately on consecutive cycles, so that part of the access time is hidden by switching between banks (see Figure 5a).

    Figure 5a: Adding some extra control and timing circuitry allows interleaved access to multiple memory instances (banks), raising the data rate seen by the host processor by 2, 3 or even 4 times (depending on the number of banks).

    In a non-interleaved access system, the memory subsystem must run at the system clock speed; if memory accesses cannot keep pace with the clock, the whole system slows down (see Figure 5b).

    Figure 5b: In a non-interleaved access system, the access time of the memory bank limits the system clock speed whenever the memory array is accessed.

    In an interleaved memory system, however, the clock rate can be raised by 2, 3 or 4 times, depending on the number of banks. Once more than two banks are interleaved, though, system complexity increases considerably.

    In a two-bank system, the clock rate can be twice the maximum speed that each memory block can handle, but because each instance cycles at half the clock rate, an individual bank does not see the change in clock speed (see Figure 5c).

    Figure 5c: In an interleaved multi-bank system, the clock speed can be several times the non-interleaved clock speed (clock × number of banks).
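    A common way to spread consecutive accesses over the memory instances (banks) is to select the bank with the low-order address bits. The sketch below illustrates that mapping together with the ideal clock relationship from the caption; the function names and the 100 MHz per-bank clock are assumptions, and the article does not state which address bits its controller uses.

```python
def bank_and_row(address, num_banks):
    """Low-order interleaving: consecutive addresses map to consecutive banks."""
    return address % num_banks, address // num_banks


def system_clock_mhz(bank_clock_mhz, num_banks):
    """Ideal interleaved system clock = per-bank clock x number of banks."""
    return bank_clock_mhz * num_banks


BANK_CLOCK_MHZ = 100   # assumed maximum clock of a single bank

for addr in range(6):
    bank, row = bank_and_row(addr, num_banks=2)
    print(f"address {addr}: bank {bank}, row {row}")

print("two-bank system clock (MHz):", system_clock_mhz(BANK_CLOCK_MHZ, 2))
```

    With two banks, even addresses land in one bank and odd addresses in the other, so a sequential stream alternates between them and each bank still sees only half the system clock rate.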

    In addition, some global logic in the memory block runs at twice the memory speed and sends address information to each of the two banks on alternate clock cycles. This global logic can be shared among several banks, which saves area and power.

    Additional logic at the data input/output ports multiplexes or demultiplexes the data, delivering data to the host system at double the data rate, or to the memory blocks at half the input rate. The effective throughput of the memory subsystem is therefore doubled, while the effective power is lower than that of a single block with twice the storage capacity.

    Although this approach can cut the access time by nearly 50%, it also brings extra support circuitry and added design and timing complexity. Data accesses to the memory generally have to be delayed by one cycle (single-cycle latency access), and accesses are only quasi-random: the system cannot access the same internal bank on every cycle.
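    The restriction on back-to-back accesses to the same bank can be illustrated with a small cycle-counting sketch. It is a simplified model under stated assumptions: the function name is hypothetical, each bank is assumed to stay busy for a fixed number of system cycles after an access, and the one-cycle output latency mentioned above is not modelled.

```python
def cycles_needed(addresses, num_banks, bank_busy_cycles):
    """Count system cycles to issue all accesses, stalling on busy banks."""
    ready_at = [0] * num_banks   # first cycle at which each bank is free again
    cycle = 0
    for addr in addresses:
        bank = addr % num_banks
        cycle = max(cycle, ready_at[bank])   # stall until the bank is free
        ready_at[bank] = cycle + bank_busy_cycles
        cycle += 1                           # one issue slot per system cycle
    return cycle


# Two banks, each busy for two system cycles per access.
sequential = [0, 1, 2, 3]   # alternates between the banks
same_bank = [0, 2, 4, 6]    # always hits bank 0

print("alternating banks:", cycles_needed(sequential, 2, 2), "cycles")
print("same bank only:   ", cycles_needed(same_bank, 2, 2), "cycles")
```

    Accesses that alternate between banks issue on every cycle, while a stream that keeps hitting one bank stalls and falls back to the speed of a single bank.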

    Tuesday, November 11th, 2008 at 07:49
Copyright © 51 Research and Design, Electronic Engineers website - Embedded Systems, MCU, DSP, EDA, Test and Measurement, Components, Communications, Power, Microelectronics, Semiconductors