US20090172215A1 - Even and odd frame combination data path architecture - Google Patents
- Publication number
- US20090172215A1 (US12/006,247)
- Authority
- US
- United States
- Prior art keywords
- parallel data
- data
- memory
- piso
- logic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/40—Bus structure
- G06F13/4004—Coupling between buses
- G06F13/4027—Coupling between buses using bus bridges
- G06F13/405—Coupling between buses using bus bridges where the bridge performs a synchronising function
- G06F13/4059—Coupling between buses using bus bridges where the bridge performs a synchronising function where the synchronisation uses buffers, e.g. for speed matching between buses
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- FIG. 12 illustrates a block diagram of a computing system 1200 in accordance with an embodiment of the invention.
- the computing system 1200 may include one or more central processing unit(s) (CPUs) 1202 or processors that communicate via an interconnection network (or bus) 1204 .
- the processors 1202 may include a general purpose processor, a network processor (that processes data communicated over a computer network 1203 ), or other types of a processor (including a reduced instruction set computer (RISC) processor or a complex instruction set computer (CISC)).
- the processors 1202 may have a single or multiple core design.
- the processors 1202 with a multiple core design may integrate different types of processor cores on the same IC die.
- processors 1202 with a multiple core design may be implemented as symmetrical or asymmetrical multiprocessors.
- techniques discussed with reference to FIGS. 1-11 may be used to transmit data between various components of system 1200 (e.g., between the processor(s) 1202 and memory 1212, between core(s) of processor(s) 1202 and memory controller 1210, etc.).
- a chipset 1206 may also communicate with the interconnection network 1204 .
- the chipset 1206 may include a memory control hub (MCH) 1208 .
- the MCH 1208 may include a memory controller 1210 that communicates with a memory 1212 .
- the memory 1212 may store data, including sequences of instructions, that are executed by the CPU 1202 , or any other device included in the computing system 1200 . For example, operations may be coded into instructions (e.g., stored in the memory 1212 ) and executed by processor(s) 1202 .
- the memory 1212 may include one or more volatile storage (or memory) devices such as random access memory (RAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), static RAM (SRAM), or other types of storage devices.
- Nonvolatile memory may also be utilized such as a hard disk. Additional devices may communicate via the interconnection network 1204 , such as multiple CPUs and/or multiple system memories.
- the MCH 1208 may also include a graphics interface 1214 that communicates with a display device 1216 .
- the graphics interface 1214 may communicate with the display device 1216 via an accelerated graphics port (AGP).
- the display 1216 (such as a flat panel display) may communicate with the graphics interface 1214 through, for example, a signal converter that translates a digital representation of an image stored in a storage device such as video memory or system memory into display signals that are interpreted and displayed by the display 1216 .
- the display signals produced by the display device may pass through various control devices before being interpreted by and subsequently displayed on the display 1216 .
- a hub interface 1218 may allow the MCH 1208 and an input/output control hub (ICH) 1220 to communicate.
- the ICH 1220 may provide an interface to I/O device(s) that communicate with the computing system 1200 .
- the ICH 1220 may communicate with a bus 1222 through a peripheral bridge (or controller) 1224 , such as a peripheral component interconnect (PCI) bridge, a universal serial bus (USB) controller, or other types of peripheral bridges or controllers.
- the bridge 1224 may provide a data path between the CPU 1202 and peripheral devices. Other types of topologies may be utilized.
- multiple buses may communicate with the ICH 1220 , e.g., through multiple bridges or controllers.
- peripherals in communication with the ICH 1220 may include, in various embodiments of the invention, integrated drive electronics (IDE) or small computer system interface (SCSI) hard drive(s), USB port(s), a keyboard, a mouse, parallel port(s), serial port(s), floppy disk drive(s), digital output support (e.g., digital video interface (DVI)), or other devices.
- the bus 1222 may communicate with an audio device 1226 , one or more disk drive(s) 1228 , and a network interface device 1230 (which is in communication with the computer network 1203 ). Other devices may communicate via the bus 1222 . Also, various components (such as the network interface device 1230 ) may communicate with the MCH 1208 via a high speed (e.g., general purpose) I/O bus channel in some embodiments of the invention. In addition, the processor 1202 and other components shown in FIG. 12 (including but not limited to the MCH 1208 , one or more components of the MCH 1208 , etc.) may be combined to form a single chip. Furthermore, a graphics accelerator may be included within the MCH 1208 in other embodiments of the invention.
- nonvolatile memory may include one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically EPROM (EEPROM), a disk drive (e.g., 1228 ), a floppy disk, a compact disk ROM (CD-ROM), a digital versatile disk (DVD), flash memory, a magneto-optical disk, or other types of nonvolatile machine-readable media that are capable of storing electronic data (e.g., including instructions).
- components of the system 1200 may be arranged in a point-to-point (PtP) configuration.
- processors, memory, and/or input/output devices may be interconnected by a number of point-to-point interfaces.
- Coupled may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements may not be in direct contact with each other, but may still cooperate or interact with each other.
Abstract
Description
- The present disclosure generally relates to the field of electronics. More particularly, an embodiment of the invention relates to an even and odd frame combination data path architecture.
- As processors increase their processing capabilities, one concern is the speed at which a main memory may be accessed by a processor. For example, to process data, a processor may need to first fetch data from a main memory. After completion of the processing, the results may need to be stored in the main memory. To improve performance, some processors may have access to a cache that temporarily stores the data. However, cache sizes are generally much smaller than a main memory. Thus, speed of an interface between a processor and a main memory may be a critical factor in overall computing performance.
- The detailed description is provided with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The use of the same reference numbers in different figures indicates similar or identical items.
- FIG. 1A illustrates odd and even frame combination data path architectures, according to some embodiments of invention.
- FIG. 1B illustrates format of data and/or commands that may be written by the drivers shown in FIG. 1A.
- FIG. 2 illustrates an example of a transmit (Tx) Input/output (IO) buffer, in accordance with an embodiment.
- FIGS. 3-5 illustrate various information relating to differential mode implementation of even and odd frame combination data path architectures, according to some embodiments.
- FIGS. 6-8 illustrate various information relating to single ended mode implementation of even and odd frame combination data path architectures, according to some embodiments.
- FIGS. 9-11 illustrate various information relating to Command/Address (CA) for single ended mode implementation of even and odd frame combination data path architectures, according to some embodiments.
- FIG. 12 illustrates a block diagram of a computing system in accordance with an embodiment of the invention.
- In the following description, numerous specific details are set forth in order to provide a thorough understanding of various embodiments. However, various embodiments of the invention may be practiced without the specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to obscure the particular embodiments of the invention. Further, various aspects of embodiments of the invention may be performed using various means, such as integrated semiconductor circuits (“hardware”), computer-readable instructions organized into one or more programs (“software”), or some combination of hardware and software. For the purposes of this disclosure reference to “logic” shall mean either hardware, software, or some combination thereof.
- Some of the embodiments discussed herein relate to even and odd frame combination data path architectures. In an embodiment, the techniques discussed here may be applied to a memory interface provided between a processor and a main memory. In some embodiments, parallel data may be received from a source and stored in a buffer (such as the FIFOs discussed with reference to FIGS. 1-11). One or more serial bit streams may be generated based on the parallel data. As discussed herein, the same architecture may be used for handling both single ended signals and differential signals. Moreover, some embodiments may be provided in various environments, such as those discussed herein with reference to FIG. 12, for example.
- FIG. 1A illustrates odd and even frame combination data path architectures, according to some embodiments of the invention. In one embodiment, the architecture shown in FIG. 1A may be utilized for a combined differential (e.g., 9 UI (Unit Interval)) and single ended (SE) (e.g., DDR (Double Data Rate) or GDDR (Graphics DDR) 2UI or 4UI memory) interface. The interface may be provided between a transmission source 102 (e.g., a processor) and a memory 104 (such as the main memory discussed with reference to FIG. 12, for example).
- FIG. 1B illustrates the format of data and/or commands that may be written by the corresponding drivers shown in FIG. 1A. For example, the top portion of FIG. 1A illustrates a differential mode eCA (embedded Command/Address data) combination driver 106A with a 9-UI frame size (e.g., as shown in the top portion of FIG. 1B). Also, the bottom portion of FIG. 1A illustrates a single ended mode configuration of the driver shown in the top portion of FIG. 1A. The single ended driver 106B may provide separate data and command/address (e.g., as shown in the bottom portion of FIG. 1B with separate data (labeled with D0, D1, etc.) and command/address (labeled with CA 0, CA 1, etc., which may have double the UI of the data, such as shown in FIG. 1B in one embodiment)). Furthermore, as shown in FIG. 1A, a single differential driver may be multiplexed with two single ended (e.g., DDR) drivers depending on the chip's operating mode, e.g., by utilizing a multiplexer 108.
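As a non-limiting illustration, the mode-dependent selection between one differential driver and two single ended drivers may be modeled in software roughly as follows. The names Mode, DriverConfig, and select_driver_config are hypothetical and are not taken from the figures; the 8-UI burst length used for the single ended case is an assumption borrowed from the discussion of FIG. 8.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    DIFFERENTIAL = "differential"
    SINGLE_ENDED = "single_ended"


@dataclass(frozen=True)
class DriverConfig:
    drivers: int        # number of active drivers behind the mode multiplexer
    frame_ui: int       # frame (or burst) length in unit intervals
    embedded_ca: bool   # True when command/address is embedded in the frame


def select_driver_config(mode: Mode) -> DriverConfig:
    """Behavioral stand-in for the mode multiplexer (hypothetical model)."""
    if mode is Mode.DIFFERENTIAL:
        # One differential eCA driver with a 9-UI frame.
        return DriverConfig(drivers=1, frame_ui=9, embedded_ca=True)
    # Two single ended (e.g., DDR) drivers with separate data and CA.
    # The 8-UI burst length is an assumption for this sketch.
    return DriverConfig(drivers=2, frame_ui=8, embedded_ca=False)
```

For example, select_driver_config(Mode.DIFFERENTIAL) yields a single driver with a 9-UI frame, while the single ended mode yields two drivers.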
- FIG. 2 illustrates an example of a transmit (Tx) Input/output (IO) buffer 200, in accordance with an embodiment. The buffer 200 may receive parallel (e.g., low speed) data 202 from a processor core and transfer the received data to a high speed IO clock (Clk) domain (e.g., using a FIFO (First In, First Out) buffer illustrated in box 204), with a PISO (Parallel Input, Serial Output) (shown in box 204) converting the transferred data to serial bit streams. The output of FIFO-PISO logic 204 may pass through a multiplexer 206 (e.g., to serialize the output signals from the logic 204 in accordance with a transmit clock labeled as TxClkxx) and a driver 208 before being driven out to the external world. In an embodiment, the PISO may include the multiplexer 206.
- Moreover, FIG. 2 illustrates a transmit data path at a high level, in accordance with one embodiment. For example, low speed parallel data is driven from core 202 (e.g., at a low speed core clock rate, such as 1/9 UI rate or frame rate) to IO along with a source sync (SS) clock (referred to herein sometimes as “clk” or “Clk”); depending on the implementation, the data could also be sent without any SS clock (e.g., just on-die wave-pipelined). Data may be first transferred to the IO clock domain using a FIFO (e.g., stretching incoming data to absorb uncertainty between clock domains, and routing/physical difference within that data byte or signal group), and then parallel to serial conversion is done using the PISO, e.g., running at the high speed IO clock (here the IO clock is shown as half rate, with a 2-UI period) (generally an odd-even data pipe). Transmit serial data is then driven out to the pad/channel using the driver 208. For simplicity, the SS (Source Sync) clock and FIFO may be omitted from subsequent figures.
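The transmit path of FIG. 2 may be pictured, very loosely, as a FIFO that hands parallel words from the core side to the IO side, followed by a serializer that emits one bit per UI. The sketch below is a purely behavioral model under that assumption; the function name transmit is hypothetical, clock domains are not simulated, and the even-odd structure of the PISO is abstracted away.

```python
from collections import deque


def transmit(core_words):
    """Behavioral sketch: core words -> FIFO -> serializer -> pad bits."""
    fifo = deque()

    # Core side: parallel words are written into the FIFO.
    for word in core_words:
        fifo.append(list(word))

    # IO side: each word is read out in arrival order and serialized,
    # one bit per unit interval (UI).
    pad_bits = []
    while fifo:
        word = fifo.popleft()
        pad_bits.extend(word)
    return pad_bits
```

Under these assumptions, transmit([[1, 0, 1], [0, 0, 1]]) produces the bits of the first word followed by the bits of the second word.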
- FIGS. 3-5 illustrate various information relating to differential mode implementation of even and odd frame combination data path architectures, according to some embodiments. Referring to FIG. 3, parallel data is sent from a core 302 to IO 304 using 9 wires (e.g., at a frame rate of 1/9th of the pad data rate) through routing wires and/or buffers 306.
- Referring to FIG. 4, a block diagram of a PISO 400 is shown, according to one embodiment. The PISO 400 (which may be used for logic 204 of FIG. 2 and/or P2S (Parallel to Serial) or PISO logic 310 of FIG. 3) may convert parallel data into serial data streams (e.g., even-odd streams that are then multiplexed out). As shown, PISO 400 may include an even bank (e.g., labeled with 6, 4, 2, and 0 in FIG. 4) and an odd bank (e.g., labeled with 7, 5, 3, and 1 in FIG. 4) of storage devices (such as edge-triggered latches) in some embodiments.
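One way to picture the even and odd banks is the following behavioral sketch, which assumes an even word width and treats each half-rate clock cycle as emitting one even-bank bit followed by one odd-bank bit. The class name EvenOddPiso and its methods are hypothetical; the sketch does not model latches or timing.

```python
class EvenOddPiso:
    """Behavioral model of a PISO with an even bank and an odd bank."""

    def __init__(self, width=8):
        if width % 2:
            raise ValueError("this sketch only covers even widths")
        self.width = width
        self.even_bank = []
        self.odd_bank = []

    def load(self, word):
        """Parallel load: split the word into even and odd positions."""
        if len(word) != self.width:
            raise ValueError("word width mismatch")
        self.even_bank = list(word[0::2])  # positions 0, 2, 4, 6
        self.odd_bank = list(word[1::2])   # positions 1, 3, 5, 7

    def clock(self):
        """One half-rate clock cycle (2 UI): even bit, then odd bit."""
        return [self.even_bank.pop(0), self.odd_bank.pop(0)]

    def serialize(self, word):
        """Load a word and shift it out through the output multiplexer."""
        self.load(word)
        out = []
        for _ in range(self.width // 2):
            out.extend(self.clock())
        return out
```

With the default width of eight, serializing a word returns its bits in their original order, which is the expected result of multiplexing the two streams.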
- FIG. 5 illustrates a timing diagram, according to an embodiment of the invention. A multiplexor (also referred to herein as “Mux”) select (Mux Sel) in the core may swap the position of odd-even bits (here shown for data bits B(0:8)) for odd frames (in the core), and a “load” signal (“Ld”) in the IO loads data into the PISO, which then serially transmits the data to the pad. Accordingly, the PISO may serialize the data. In an embodiment, the PISO converts 9 parallel data bits into two even-odd bit streams, which are then multiplexed and driven out to the pad.
- In some embodiments, in differential mode (e.g., 9UI frame), the core may send out parallel low speed data to IO using 9 wires running at frame rate (e.g., 1/9th rate of pad data). Low speed parallel core data may be first loaded into the PISO in IO running at the high-speed local IO transmit clock (with a local “load” signal, where load generally defines a safe window for incoming data, for example, a signal running at frame rate, 1/9th; also, the load position is programmable in some embodiments). Since the data frame is 9-UI in differential mode (an odd number, unlike, for example, 8UI or 4UI) and the PISO converts the data to even-odd bit streams using a half rate clock, the position of even and odd bits on subsequent frames may be swapped (in alternate frames), such as shown in FIG. 5. The swapping may be done using a multiplexer in the core as shown in FIG. 3 (e.g., for frame A(0:8) the odd-even position is maintained, and for frame B(0:8) the position of odd bits is swapped with even bits). The Mux Sel signal may be operated at half of the frame rate (which may be half of the core clock rate) as shown in the timing waveform (FIG. 5).
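The need for the swap can be seen in a small behavioral model. Because a frame spans nine UIs and the half-rate clock has a 2-UI period, each new frame starts on the opposite clock phase from the previous one. The sketch below assumes the output multiplexer picks one stream on phase 0 and the other on phase 1 of a free-running UI counter; the function names and the phase convention are hypothetical.

```python
FRAME_UI = 9  # 9-UI frame in differential mode


def split_streams(frame, mux_sel):
    """Core-side mux: return (phase0_stream, phase1_stream) for a frame."""
    even_bits = list(frame[0::2])  # bits 0, 2, 4, 6, 8
    odd_bits = list(frame[1::2])   # bits 1, 3, 5, 7
    # mux_sel == 1 swaps the even and odd positions (alternate frames).
    return (odd_bits, even_bits) if mux_sel else (even_bits, odd_bits)


def serialize(frames):
    """Serialize frames with a free-running half-rate clock phase."""
    pad = []
    ui = 0  # global UI counter; ui % 2 is the half-rate clock phase
    for frame_index, frame in enumerate(frames):
        mux_sel = frame_index % 2  # toggles once per frame (half frame rate)
        streams = split_streams(frame, mux_sel)
        for _ in range(FRAME_UI):
            pad.append(streams[ui % 2].pop(0))
            ui += 1
    return pad
```

In this model, frame A starts on phase 0 and frame B starts on phase 1, so swapping the streams for frame B keeps the pad bits in their original order.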
- FIGS. 6-8 illustrate various information relating to single ended mode implementation of even and odd frame combination data path architectures, according to some embodiments. Referring to FIG. 6, a core sends out data to IO via routing wires and/or buffers, and the data is forwarded to a memory (e.g., a DDR memory) via two (e.g., DDR) drivers.
- Referring to FIG. 7, a block diagram of a PISO 700 is shown, according to one embodiment. The PISO 700 (which may be used for the P2S 610 in an embodiment) may be operated based on a full-rate IO clock and serialize data for the pad. As shown, PISO 700 may include an even bank (e.g., labeled with 6, 4, 2, and 0 in FIG. 7) and an odd bank (e.g., labeled with 7, 5, 3, and 1 in FIG. 7) of storage devices (such as edge-triggered latches) in some embodiments.
- Referring to FIG. 8, a timing waveform for one of the two drivers (e.g., shown in FIGS. 6 and 7) is shown, according to an embodiment. In one embodiment, the two drivers may be identical. In single ended (e.g., DDR) mode, an 8-UI data burst (DQ) and 4-UI command/address (CA) lines may be supported (e.g., using the combination buffer). In some embodiments, a combination driver may support two DDR drivers. Each DDR driver may send out data DQ (e.g., an 8-UI data burst) or Command/Address (e.g., 4UI or lower speed) to a memory (e.g., including memory 1212 of FIG. 12). The DDR UI size (e.g., in ps) may in general be 2× bigger than the differential UI (e.g., 3.2 GT/s differential vs. 1600 MT/s DDR in some embodiments). In one embodiment, DDR or single ended mode may utilize an 8-UI PISO, operating based on the full-rate clock. The core may send out 4+4 (or eight) parallel low speed data bits to IO for two DDR drivers as discussed with reference to FIGS. 6-8. As an example, for a pad data rate of 1600 MT/s (UI=625 ps), the IO clock may be about 1600 MHz (e.g., full rate), where the core clock is at about 800 MHz. The core to IO data rate may be 400 MT/s (1/4 of the pad rate), and four-to-one parallel to serial conversion may occur in the IO PISO, as shown in FIGS. 6-9. In one embodiment, the ninth wire and ninth bit used in the differential mode may be ignored in DDR or single ended mode.
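The example rates above follow from simple arithmetic, which may be checked with the short sketch below. The helper names ui_ps and core_to_io_rate_mts are hypothetical; rates are expressed in MT/s and unit intervals in picoseconds.

```python
def ui_ps(rate_mts):
    """Unit interval in picoseconds for a pad data rate in MT/s."""
    return 1e6 / rate_mts


def core_to_io_rate_mts(pad_rate_mts, ratio):
    """Per-wire core-to-IO rate for a ratio-to-one serializer."""
    return pad_rate_mts / ratio


ddr_pad_rate = 1600.0   # MT/s, single ended (DDR) example
diff_pad_rate = 3200.0  # MT/s, differential example (3.2 GT/s)

ddr_ui = ui_ps(ddr_pad_rate)                    # 625 ps
diff_ui = ui_ps(diff_pad_rate)                  # 312.5 ps
ddr_core_rate = core_to_io_rate_mts(ddr_pad_rate, 4)  # 400 MT/s
```

These values reproduce the 625 ps UI, the 2× ratio between the DDR and differential UI, and the 400 MT/s core to IO rate quoted in the example.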
FIGS. 9-11 illustrate various information relating to Command/Address (CA) for single ended mode (e.g., for a DDR mode) implementation of even and odd frame combination data path architectures, according to some embodiments. Referring toFIG. 9 , a core sends out a CA, e.g., in every core cycle (e.g., 400 MTs), to IO (e.g., stretched to two core cycles and staggered by one core cycle). - Referring to
FIG. 10, a block diagram of a PISO 1000 is shown, according to one embodiment of the invention. The PISO 1000 (e.g., such as the P2S 910 in the IO partition) may capture the data using two staggered load signals (as shown in FIGS. 9-11), and serially drive the data to the pad. For example, for a 1600 MTs DDR DQ, an 800 MTs CA needs to be sent out. A core running at 800 MHz may generate one CA in every core clock cycle, which may be sent to IO (e.g., with each CA from the core being stretched to two cycles, for example, to result in 400 MTs). Further, each CA may be staggered by one core cycle, which is loaded and multiplexed in IO to send out 800 MTs CA at the pad. In some embodiments, for differential mode and single ended (e.g., DDR DQ) mode, loads may be the same (1d−xx=1dxx1), where in DDR CA mode 1dxx1=1dxx+2UI (shifted). As shown, PISO 1000 may include an even bank (e.g., labeled with 8, 6, 4, 2, and 0 in FIG. 10) and an odd bank (e.g., labeled with 7, 5, 3, and 1 in FIG. 10) of storage devices (such as edge-triggered latches) in some embodiments. - Additionally, even though the transmit path is used in the present disclosure to illustrate the embodiments, the techniques discussed may also be applied to a combination receive (Rx) path, in which case the PISO logic discussed may be replaced with SIPO (Serial In Parallel Out) logic. Additionally, the bit swapping may be performed by a memory controller prior to transmission to a FIFO, to the SIPO, and consequently to drivers that provide the data to a processor.
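The even and odd bank arrangement described for PISO 700 and PISO 1000 can be illustrated with a small behavioral model. The following is only a sketch under stated assumptions, not the circuit of the figures: the function names (`load_banks`, `piso_serialize`, `sipo_deserialize`) are hypothetical, bit 0 is assumed to be driven first, and clocking, load signals, and latch timing are not modeled.

```python
from typing import List, Tuple


def load_banks(parallel: List[int]) -> Tuple[List[int], List[int]]:
    """Split a parallel word into an even bank (bits 0, 2, 4, ...)
    and an odd bank (bits 1, 3, 5, ...)."""
    return parallel[0::2], parallel[1::2]


def piso_serialize(parallel: List[int]) -> List[int]:
    """Drive one bit per UI, alternating between the even and odd banks.
    Assumption: bit 0 goes out first."""
    even, odd = load_banks(parallel)
    serial = []
    for ui in range(len(parallel)):
        bank = even if ui % 2 == 0 else odd
        serial.append(bank[ui // 2])
    return serial


def sipo_deserialize(serial: List[int]) -> List[int]:
    """Receive-side counterpart: capture alternating UIs into even and odd
    banks, then present them as one parallel word."""
    even, odd = serial[0::2], serial[1::2]
    parallel = [0] * len(serial)
    parallel[0::2] = even
    parallel[1::2] = odd
    return parallel


if __name__ == "__main__":
    word8 = [1, 0, 1, 1, 0, 0, 1, 0]      # 8-UI (e.g., DDR) frame
    word9 = [1, 0, 1, 1, 0, 0, 1, 0, 1]   # 9-UI (e.g., differential) frame
    print(load_banks(word8))
    print(piso_serialize(word9))
    print(sipo_deserialize(piso_serialize(word8)) == word8)
```

With a 9-bit word the even bank holds five entries and the odd bank four, which matches the bank labels given for FIG. 10; with an 8-bit word both banks hold four entries, as in FIG. 7.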
- Furthermore, some of the embodiments discussed herein may allow for one or more of: (a) Resource sharing between 9UI, 8UI combination data paths (e.g., physical routing and resource leverage). (b) Direct conversion of parallel low speed data to high speed data at IO (e.g., 9 to 1 or 8 to 1 conversion in IO). No intermediate speed conversion or multiple levels of FIFO in between may be needed. (c) Reduced and/or optimized data path latency (e.g., less conversion, fewer levels of circuitry). (d) Lower power and clock loading (e.g., in part, since additional conversion and FIFO levels have been removed). (e) Lower latency data path. (f) Improved power efficiency and simplicity over some current implementations.
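The rate relationships used in the DDR example discussed with reference to FIG. 8 can be checked with a few lines of arithmetic. This is a minimal sketch; the helper names `ui_ps` and `core_to_io_rate_mts` are hypothetical, and the only inputs are figures quoted in the description (1600 MTs pad rate, 3.2 GTs differential rate, four-to-one conversion in the IO PISO).

```python
def ui_ps(rate_mts: float) -> float:
    """Unit interval in picoseconds for a pad rate given in MT/s.
    One transfer takes 1 / (rate * 1e6) seconds, i.e. 1e6 / rate picoseconds."""
    return 1_000_000 / rate_mts


def core_to_io_rate_mts(pad_rate_mts: float, conversion_ratio: int) -> float:
    """Per-wire core-to-IO rate when the IO PISO performs an
    N-to-1 parallel-to-serial conversion."""
    return pad_rate_mts / conversion_ratio


if __name__ == "__main__":
    ddr_rate = 1600.0    # MT/s, single ended (DDR) example
    diff_rate = 3200.0   # MT/s, i.e. 3.2 GT/s differential example

    print(ui_ps(ddr_rate))                     # DDR unit interval in ps
    print(ui_ps(diff_rate))                    # differential unit interval in ps
    print(ui_ps(ddr_rate) / ui_ps(diff_rate))  # ratio of DDR UI to differential UI
    print(core_to_io_rate_mts(ddr_rate, 4))    # core-to-IO rate for 4-to-1 conversion
```

The results agree with the figures in the description: a 625 ps UI at 1600 MTs, a DDR UI twice the differential UI, and a 400 MTs core-to-IO rate for the four-to-one case.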
-
FIG. 12 illustrates a block diagram of a computing system 1200 in accordance with an embodiment of the invention. The computing system 1200 may include one or more central processing unit(s) (CPUs) 1202 or processors that communicate via an interconnection network (or bus) 1204. The processors 1202 may include a general purpose processor, a network processor (that processes data communicated over a computer network 1203), or other types of a processor (including a reduced instruction set computer (RISC) processor or a complex instruction set computer (CISC)). Moreover, the processors 1202 may have a single or multiple core design. The processors 1202 with a multiple core design may integrate different types of processor cores on the same IC die. Also, the processors 1202 with a multiple core design may be implemented as symmetrical or asymmetrical multiprocessors. In an embodiment, techniques discussed with reference to FIGS. 1-11 may be used to transmit data between various components of system 1200 (e.g., between the processor(s) 1202 and memory 1212, between core(s) of processor(s) 1202 and memory controller 1210, etc.). - A
chipset 1206 may also communicate with the interconnection network 1204. The chipset 1206 may include a memory control hub (MCH) 1208. The MCH 1208 may include a memory controller 1210 that communicates with a memory 1212. The memory 1212 may store data, including sequences of instructions, that are executed by the CPU 1202, or any other device included in the computing system 1200. For example, operations may be coded into instructions (e.g., stored in the memory 1212) and executed by processor(s) 1202. In one embodiment of the invention, the memory 1212 may include one or more volatile storage (or memory) devices such as random access memory (RAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), static RAM (SRAM), or other types of storage devices. Nonvolatile memory may also be utilized such as a hard disk. Additional devices may communicate via the interconnection network 1204, such as multiple CPUs and/or multiple system memories. - The
MCH 1208 may also include a graphics interface 1214 that communicates with a display device 1216. In one embodiment of the invention, the graphics interface 1214 may communicate with the display device 1216 via an accelerated graphics port (AGP). In an embodiment of the invention, the display 1216 (such as a flat panel display) may communicate with the graphics interface 1214 through, for example, a signal converter that translates a digital representation of an image stored in a storage device such as video memory or system memory into display signals that are interpreted and displayed by the display 1216. The display signals produced by the display device may pass through various control devices before being interpreted by and subsequently displayed on the display 1216. - A
hub interface 1218 may allow the MCH 1208 and an input/output control hub (ICH) 1220 to communicate. The ICH 1220 may provide an interface to I/O device(s) that communicate with the computing system 1200. The ICH 1220 may communicate with a bus 1222 through a peripheral bridge (or controller) 1224, such as a peripheral component interconnect (PCI) bridge, a universal serial bus (USB) controller, or other types of peripheral bridges or controllers. The bridge 1224 may provide a data path between the CPU 1202 and peripheral devices. Other types of topologies may be utilized. Also, multiple buses may communicate with the ICH 1220, e.g., through multiple bridges or controllers. Moreover, other peripherals in communication with the ICH 1220 may include, in various embodiments of the invention, integrated drive electronics (IDE) or small computer system interface (SCSI) hard drive(s), USB port(s), a keyboard, a mouse, parallel port(s), serial port(s), floppy disk drive(s), digital output support (e.g., digital video interface (DVI)), or other devices. - The
bus 1222 may communicate with an audio device 1226, one or more disk drive(s) 1228, and a network interface device 1230 (which is in communication with the computer network 1203). Other devices may communicate via the bus 1222. Also, various components (such as the network interface device 1230) may communicate with the MCH 1208 via a high speed (e.g., general purpose) I/O bus channel in some embodiments of the invention. In addition, the processor 1202 and other components shown in FIG. 12 (including but not limited to the MCH 1208, one or more components of the MCH 1208, etc.) may be combined to form a single chip. Furthermore, a graphics accelerator may be included within the MCH 1208 in other embodiments of the invention. - Furthermore, the
computing system 1200 may include volatile and/or nonvolatile memory (or storage). For example, nonvolatile memory may include one or more of the following: read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically EPROM (EEPROM), a disk drive (e.g., 1228), a floppy disk, a compact disk ROM (CD-ROM), a digital versatile disk (DVD), flash memory, a magneto-optical disk, or other types of nonvolatile machine-readable media that are capable of storing electronic data (e.g., including instructions). - In an embodiment, components of the
system 1200 may be arranged in a point-to-point (PtP) configuration. For example, processors, memory, and/or input/output devices may be interconnected by a number of point-to-point interfaces. - Reference in the specification to “one embodiment,” “an embodiment,” or “some embodiments” means that a particular feature, structure, or characteristic described in connection with the embodiment(s) may be included in at least an implementation. The appearances of the phrase “in one embodiment” in various places in the specification may or may not be all referring to the same embodiment.
- Also, in the description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. In some embodiments of the invention, “connected” may be used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” may mean that two or more elements are in direct physical or electrical contact. However, “coupled” may also mean that two or more elements may not be in direct contact with each other, but may still cooperate or interact with each other.
- Thus, although embodiments of the invention have been described in language specific to structural features and/or methodological acts, it is to be understood that claimed subject matter may not be limited to the specific features or acts described. Rather, the specific features and acts are disclosed as sample forms of implementing the claimed subject matter.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/006,247 US8225016B2 (en) | 2007-12-31 | 2007-12-31 | Even and odd frame combination data path architecture |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/006,247 US8225016B2 (en) | 2007-12-31 | 2007-12-31 | Even and odd frame combination data path architecture |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090172215A1 (en) | 2009-07-02 |
US8225016B2 (en) | 2012-07-17 |
Family
ID=40799964
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/006,247 (US8225016B2 (en), Expired - Fee Related) | 2007-12-31 | 2007-12-31 | Even and odd frame combination data path architecture |
Country Status (1)
Country | Link |
---|---|
US (1) | US8225016B2 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104050130A (en) * | 2013-03-15 | 2014-09-17 | 辉达公司 | On-package multiprocessor ground-referenced single-ended interconnect |
US20150269112A1 (en) * | 2014-03-18 | 2015-09-24 | Tzu-Chien Hsueh | Reconfigurable transmitter |
CN107645672A (en) * | 2017-08-24 | 2018-01-30 | 长芯盛(武汉)科技有限公司 | A kind of multi-medium data line for being easy to low speed signal transmission |
US20220301651A1 (en) * | 2021-03-18 | 2022-09-22 | Innogrit Technologies Co., Ltd. | Memory controller physical interface with differential loopback testing |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102200489B1 (en) * | 2014-05-30 | 2021-01-11 | 삼성전자주식회사 | Nonvolatile memory device and storage device having the same |
US10241938B1 (en) | 2017-12-20 | 2019-03-26 | Sandisk Technologies Llc | Output data path for non-volatile memory |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5077656A (en) * | 1986-03-20 | 1991-12-31 | Channelnet Corporation | CPU channel to control unit extender |
US20060195631A1 (en) * | 2005-01-31 | 2006-08-31 | Ramasubramanian Rajamani | Memory buffers for merging local data from memory modules |
US20060268723A1 (en) * | 2005-05-24 | 2006-11-30 | Danny Vogel | Selective test point for high speed SERDES cores in semiconductor design |
US20070005831A1 (en) * | 2005-06-30 | 2007-01-04 | Peter Gregorius | Semiconductor memory system |
US20070220401A1 (en) * | 2006-02-27 | 2007-09-20 | Intel Corporation | Systems, methods, and apparatuses for using the same memory type to support an error check mode and a non-error check mode |
US20080115039A1 (en) * | 2006-10-31 | 2008-05-15 | Intel Corporation | Destination indication to aid in posted write buffer loading |
US20080222443A1 (en) * | 2005-01-14 | 2008-09-11 | Qimonda Ag | Controller |
US7590789B2 (en) * | 2007-12-07 | 2009-09-15 | Intel Corporation | Optimizing clock crossing and data path latency |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5077656A (en) * | 1986-03-20 | 1991-12-31 | Channelnet Corporation | CPU channel to control unit extender |
US20080222443A1 (en) * | 2005-01-14 | 2008-09-11 | Qimonda Ag | Controller |
US20060195631A1 (en) * | 2005-01-31 | 2006-08-31 | Ramasubramanian Rajamani | Memory buffers for merging local data from memory modules |
US20090013108A1 (en) * | 2005-01-31 | 2009-01-08 | Intel Corporation | Memory buffers for merging local data from memory modules |
US20060268723A1 (en) * | 2005-05-24 | 2006-11-30 | Danny Vogel | Selective test point for high speed SERDES cores in semiconductor design |
US20070005831A1 (en) * | 2005-06-30 | 2007-01-04 | Peter Gregorius | Semiconductor memory system |
US20070220401A1 (en) * | 2006-02-27 | 2007-09-20 | Intel Corporation | Systems, methods, and apparatuses for using the same memory type to support an error check mode and a non-error check mode |
US20080115039A1 (en) * | 2006-10-31 | 2008-05-15 | Intel Corporation | Destination indication to aid in posted write buffer loading |
US7590789B2 (en) * | 2007-12-07 | 2009-09-15 | Intel Corporation | Optimizing clock crossing and data path latency |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104050130A (en) * | 2013-03-15 | 2014-09-17 | 辉达公司 | On-package multiprocessor ground-referenced single-ended interconnect |
US20150269112A1 (en) * | 2014-03-18 | 2015-09-24 | Tzu-Chien Hsueh | Reconfigurable transmitter |
US9582454B2 (en) * | 2014-03-18 | 2017-02-28 | Intel Corporation | Reconfigurable transmitter |
US10216680B2 (en) | 2014-03-18 | 2019-02-26 | Intel Corporation | Reconfigurable transmitter |
US10664430B2 (en) | 2014-03-18 | 2020-05-26 | Intel Corporation | Reconfigurable transmitter |
US11126581B2 (en) | 2014-03-18 | 2021-09-21 | Intel Corporation | Reconfigurable transmitter |
CN107645672A (en) * | 2017-08-24 | 2018-01-30 | 长芯盛(武汉)科技有限公司 | A kind of multi-medium data line for being easy to low speed signal transmission |
US20220301651A1 (en) * | 2021-03-18 | 2022-09-22 | Innogrit Technologies Co., Ltd. | Memory controller physical interface with differential loopback testing |
US11594296B2 (en) * | 2021-03-18 | 2023-02-28 | Innogrit Technologies Co., Ltd. | Memory controller physical interface with differential loopback testing |
Also Published As
Publication number | Publication date |
---|---|
US8225016B2 (en) | 2012-07-17 |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:RASHID, MAMUN UR;REEL/FRAME:023935/0258. Effective date: 20080314 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
FP | Lapsed due to failure to pay maintenance fee | Effective date: 20200717 |