US11016679B2 - Balanced die set execution in a data storage system - Google Patents
Balanced die set execution in a data storage system
- Publication number
- US11016679B2 (application US 16/023,420)
- Authority
- US
- United States
- Prior art keywords
- data access
- access command
- die set
- data
- quality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/061—Improving I/O performance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0673—Single storage device
- G06F3/0679—Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C11/00—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor
- G11C11/21—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements
- G11C11/34—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices
- G11C11/40—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors
- G11C11/401—Digital stores characterised by the use of particular electric or magnetic storage elements; Storage elements therefor using electric elements using semiconductor devices using transistors forming cells needing refreshing or charge regeneration, i.e. dynamic cells
- G11C11/4063—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing or timing
- G11C11/407—Auxiliary circuits, e.g. for addressing, decoding, driving, writing, sensing or timing for memory cells of the field-effect type
- G11C11/409—Read-write [R-W] circuits
- G11C11/4093—Input/output [I/O] data interface arrangements, e.g. data buffers
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C5/00—Details of stores covered by group G11C11/00
- G11C5/02—Disposition of storage elements, e.g. in the form of a matrix array
- G11C5/025—Geometric lay-out considerations of storage- and peripheral-blocks in a semiconductor storage device
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C8/00—Arrangements for selecting an address in a digital store
- G11C8/12—Group selection circuits, e.g. for memory block selection, chip selection, array selection
Definitions
- Various embodiments of the present disclosure are generally directed to the management of operations in a memory, such as, but not limited to, a flash memory in a solid state drive (SSD).
- In some embodiments, a data storage device has a semiconductor memory divided into a plurality of die sets, where performance metrics of execution of a first data access command to a first die set and of a second data access command to a second die set are measured.
- A proactive strategy is generated with a quality of service module, based on the measured performance metrics, to maintain consistent data access command execution performance, and a third data access command is altered, as directed by the proactive strategy, to prevent a predicted non-uniformity of data access command performance between the first die set and the second die set.
- Performance metrics are measured, in other embodiments, for a first data access command to a first die set of a semiconductor memory and for a second data access command to a second die set of the semiconductor memory.
- A proactive strategy is generated to maintain consistent data access command execution performance with a quality of service module based on the measured performance metrics.
- A reactive strategy is generated to maintain consistent data access command execution performance with the quality of service module based on detection of an unpredicted event.
- At least a third data access command is altered, as directed by the proactive strategy, to prevent a predicted non-uniformity of data access command performance between the first die set and the second die set.
- In still other embodiments, a quality of service module is connected to a first die set and a second die set of a semiconductor memory, with the quality of service module configured to measure performance metrics of execution of a first data access command to the first die set and of a second data access command to the second die set.
- A prediction circuit of the quality of service module generates a proactive strategy to maintain consistent data access command execution performance based on the measured performance metrics, and a scheduler circuit of the quality of service module is configured to alter a third data access command, as directed by the proactive strategy, to prevent a predicted non-uniformity of data access command performance between the first die set and the second die set.
- FIG. 1 provides a functional block representation of a data storage device in accordance with various embodiments.
- FIG. 2 shows aspects of the device of FIG. 1 characterized as a solid state drive (SSD) in accordance with some embodiments.
- FIG. 3 is an arrangement of the flash memory of FIG. 2 in some embodiments.
- FIG. 4 illustrates the use of channels to access the dies in FIG. 3 in some embodiments.
- FIG. 5 represents a map unit (MU) as a data arrangement stored to the flash memory of FIG. 2 .
- FIG. 6 shows a functional block diagram for a GCU management circuit of the SSD in accordance with some embodiments.
- FIG. 7 illustrates an arrangement of various GCUs and corresponding tables of verified GCUs (TOVGs) for a number of different die sets in some embodiments.
- FIG. 8 displays a functional block diagram for a GCU management circuit of the SSD in accordance with some embodiments.
- FIG. 9 depicts an arrangement of various GCUs and corresponding tables of verified GCUs (TOVGs) for a number of different die sets in some embodiments.
- FIG. 10 illustrates an example data set that can be written to the data storage device of FIG. 1 in accordance with assorted embodiments.
- FIG. 11 plots operational data for an example data storage system employing various embodiments of the present disclosure.
- FIG. 12 conveys a block representation of an example data storage system in which various embodiments may be practiced.
- FIG. 13 represents portions of an example data storage system configured in accordance with various embodiments.
- FIG. 14 shows an example quality of service module capable of being used in a data storage system in accordance with some embodiments.
- FIG. 15 is a flowchart of an example data access process that can be conducted in a data storage system in accordance with assorted embodiments.
- FIG. 16 is a flowchart of an example data access process capable of being utilized in a data storage system in accordance with various embodiments.
- FIG. 17 is an example balancing routine that can be carried out with the respective embodiments of FIGS. 1-16 .
- The various embodiments disclosed herein are generally directed to managing data access and data maintenance operations in one or more data storage devices of a data storage system to provide balanced data access operation execution.
- SSDs are data storage devices that store user data in non-volatile memory (NVM) made up of an array of solid-state semiconductor memory cells.
- SSDs usually have an NVM module and a controller. The controller controls the transfer of data between the NVM and a host device.
- The NVM will usually be NAND flash memory, but other forms of solid-state memory can be used.
- A flash memory module may be arranged as a series of dies.
- A die represents a separate, physical block of semiconductor memory cells.
- The controller communicates with the dies using a number of channels, or lanes, with each channel connected to a different subset of the dies. Any respective numbers of channels and dies can be used. Groups of dies may be arranged into die sets, which may correspond with the NVMe (Non-Volatile Memory Express) Standard. This standard enables multiple owners (users) to access and control separate portions of a given SSD (or other memory device).
- Metadata is often generated and used to describe and control the data stored to an SSD.
- The metadata may take the form of one or more map structures that track the locations of data blocks written to various GCUs (garbage collection units), which are sets of erasure blocks that are erased and allocated as a unit.
- The map structures can include a forward map and a reverse directory, although other forms can be used.
- The forward map provides an overall map structure that can be accessed by a controller to service a received host access command (e.g., a write command, a read command, etc.).
- The forward map may take the form of a two-level map, where a first level of the map maintains the locations of map pages and a second level of the map provides a flash transition layer (FTL) to provide association of logical addresses of the data blocks to physical addresses at which the blocks are stored.
- Other forms of maps can be used including single level maps and three-or-more level maps, but each generally provides a forward map structure in which pointers may be used to point to each successive block until the most current version is located.
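- As a rough, hypothetical illustration of the two-level forward map described above (the structure and field names below are assumptions, not taken from the patent), the following Python sketch resolves a logical block address by first locating the map page and then reading the FTL entry within it:

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class PhysicalAddress:
    die: int
    plane: int
    gcu: int
    erasure_block: int
    page: int
    offset: int

class TwoLevelForwardMap:
    """Minimal sketch: level one locates the map page for an LBA,
    level two (the FTL entries) maps the LBA to a physical address."""

    def __init__(self, entries_per_map_page: int = 1024):
        self.entries_per_map_page = entries_per_map_page
        self.level1: Dict[int, Dict[int, PhysicalAddress]] = {}

    def update(self, lba: int, paddr: PhysicalAddress) -> None:
        page_idx, entry_idx = divmod(lba, self.entries_per_map_page)
        self.level1.setdefault(page_idx, {})[entry_idx] = paddr

    def lookup(self, lba: int) -> Optional[PhysicalAddress]:
        page_idx, entry_idx = divmod(lba, self.entries_per_map_page)
        map_page = self.level1.get(page_idx)   # first level: find the map page
        if map_page is None:
            return None
        return map_page.get(entry_idx)         # second level: read the FTL entry

# Record a write, then resolve the same LBA as servicing a read command would require.
fmap = TwoLevelForwardMap()
fmap.update(4096, PhysicalAddress(die=3, plane=0, gcu=12, erasure_block=7, page=88, offset=0))
print(fmap.lookup(4096))
```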
- The reverse directory can be written to the various GCUs and provides local data identifying, by logical address, which data blocks are stored in the associated GCU.
- The reverse directory, also sometimes referred to as a footer, thus provides a physical-to-logical association for the locally stored blocks.
- The reverse directory can take any number of suitable forms. Reverse directories are particularly useful during garbage collection operations, since a reverse directory can be used to determine which data blocks are still current and should be relocated before the associated erasure blocks in the GCU are erased.
- SSDs expend a significant amount of resources on maintaining accurate and up-to-date map structures. Nevertheless, it is possible from time to time to have a mismatch between the forward map and the reverse directory for a given GCU. These situations are usually noted at the time of garbage collection.
- For example, the forward map may indicate that there are X valid data blocks in a given erasure block (EB), but the reverse directory identifies a different number, Y, of valid blocks in the EB.
- The garbage collection operation may be rescheduled or may take a longer period of time to complete while the system obtains a correct count before proceeding with the recycling operation.
- The NVMe specification provides that a storage device should have the ability to provide guaranteed levels of deterministic performance for specified periods of time (deterministic windows, or DWs). To the extent that a garbage collection operation is scheduled during a DW, it is desirable to ensure that the actual time that the garbage collection operation would require to complete is an accurate estimate in order for the system to decide whether and when to carry out the GC operation.
- SSDs include a top level controller circuit and a flash (or other semiconductor) memory module.
- A number of channels, or lanes, are provided to enable communications between the controller and dies within the flash memory.
- The dies are further subdivided into planes, GCUs, erasure blocks, pages, etc. Groups of dies may be arranged into separate die sets, or namespaces. This allows the various die sets to be concurrently serviced for different owners (users).
- A data storage device generally carries out three (3) main operations: (1) hot data transfers during which user data sets are written to or read from the flash memory; (2) cold data transfers during which the device carries out garbage collection and other operations to free up memory for the storage of new data; and (3) map updates in which snapshots and journals are accumulated and written to maintain an up-to-date system map of the memory locations in which data sets are stored.
- A data storage device can periodically enter a deterministic window (DW) during which certain operational performance is guaranteed, such as guaranteed data delivery without retransmission.
- The specification is not clear on exactly how long the DW is required to last, or by what metrics the device can be measured.
- One example of DW performance is that X number of reads can be carried out at a certain minimum data transfer rate; another is that so many blocks may be written to completion within a particular period of time. It is contemplated that a user can declare a DW at substantially any given time, and it is not usually known when a DW will be declared. Outside of a DW, the device operates in a non-deterministic window (NDW).
- A number of channels, or lanes, may be provided in a data storage system to enable communications between one or more controllers and dies within the flash memory.
- The operating capacity of flash memory can be divided among a number of die sets, with each die set controlled by a different owner/user/host.
- Many different users/owners/hosts can concurrently access a given SSD and generate data access (e.g., read, write) commands, such as data reads, data writes, and background operations.
- In a data storage system with a limited number of resources, such as available local memory (e.g., DRAM, buffers, etc.), power budget, and processing capability, operation management is emphasized in order to maintain adequate performance for all users/owners.
- Embodiments are directed to optimizing I/O by keeping a running estimate of allocated work per resource, such as how much time it takes to perform a write, a read, etc., to choose which work task to perform next in order to balance data access execution performance across all owners/hosts in a data storage system.
- Embodiments maintain estimates of the time required to perform certain tasks and decide whether something else can be done within a given time slot. Such considerations can be customized to DW intervals so that data access commands and/or background tasks are executed based on reliable estimations of how long execution will take, which can optimize data read performance consistency during a DW interval.
- Reliable prediction of how long a task will take to execute may involve gathering and maintaining background metrics on actual tasks that have been performed.
- The collected execution data can be placed into buckets or other groupings based on different scenarios, such as high, medium, or low utilization times. From this, a data storage system can estimate how much time a given task will realistically take, and schedule tasks, and/or alter task priority, as needed to obtain optimum throughput for all owners.
- Minimum rates of I/O can be established for certain NDW and DW intervals with priority given as required to ensure the data storage system provides consistent performance to each connected host.
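- A minimal Python sketch of the bucketed, running estimation described above is shown below; the bucket boundaries, task names, and the use of a high quantile are illustrative assumptions rather than the patent's method:

```python
import statistics
from collections import defaultdict

class ExecutionEstimator:
    """Keeps per-task execution samples in utilization buckets
    (high/medium/low) and returns a conservative time estimate."""

    def __init__(self):
        self.samples = defaultdict(list)   # (task, bucket) -> list of seconds

    @staticmethod
    def bucket(utilization: float) -> str:
        if utilization > 0.75:
            return "high"
        if utilization > 0.40:
            return "medium"
        return "low"

    def record(self, task: str, utilization: float, seconds: float) -> None:
        self.samples[(task, self.bucket(utilization))].append(seconds)

    def estimate(self, task: str, utilization: float) -> float:
        data = self.samples[(task, self.bucket(utilization))]
        if not data:
            return float("inf")            # no history yet: assume worst case
        if len(data) < 2:
            return data[0]
        # Use a high quantile rather than the mean so the estimate is reliable.
        return statistics.quantiles(data, n=10)[-1]

est = ExecutionEstimator()
for t in (0.9, 1.1, 1.0, 1.3, 0.95):
    est.record("write_4k", utilization=0.8, seconds=t)
print(est.estimate("write_4k", utilization=0.85))   # conservative high-utilization estimate
```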
- Data storage system performance is balanced among die sets by allocating additional power and buffer resources to specific die sets when write amplification conditions have increased through loss of capacity and/or overprovisioning. By sacrificing performance for other die sets to equalize performance across all die sets, data accesses can be balanced to provide reliable and predictable performance to multiple different hosts/owners.
- A data storage system can shift resources to one or more of the die sets, which gives priority to those die sets in order to balance out overall system performance.
- Balancing data access execution across different die sets may be counter-intuitive compared to providing maximum data access performance when available and promoting activities for lesser utilized sets over greater utilized sets, but the goal is to meet a consistent quality of service for every user/host rather than maximizing performance.
- A data storage system could be penalized for providing inconsistently fast data access execution times to one or more owners/hosts instead of consistent times for all users/hosts.
- With the ability to adapt data storage system configuration and data access execution, particularly as the available amount of overprovisioning decreases when data capacity nears full, data access requests can be managed to maintain consistency among users/hosts despite dynamic operating conditions.
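- The resource-shifting idea described above can be pictured with a short, hypothetical Python sketch (the proportional weighting and the numbers are illustrative only): die sets whose write amplification has risen receive a larger share of a fixed power/buffer budget at the expense of the other sets:

```python
from typing import Dict

def rebalance_budget(total_budget: float, write_amp: Dict[str, float]) -> Dict[str, float]:
    """Split a fixed resource budget (e.g., buffer space or power units)
    across die sets in proportion to their current write amplification,
    so harder-hit sets get more help and overall performance stays even."""
    total_wa = sum(write_amp.values())
    return {ds: total_budget * wa / total_wa for ds, wa in write_amp.items()}

# Die set B has lost overprovisioning and its write amplification has grown,
# so it is granted a larger slice of the shared budget than sets A and C.
print(rebalance_budget(100.0, {"set_A": 1.2, "set_B": 3.0, "set_C": 1.3}))
```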
- The device 100 has a controller 102 and a memory module 104.
- The controller block 102 represents a hardware-based and/or programmable processor-based circuit configured to provide top level communication and control functions.
- The memory module 104 includes solid state non-volatile memory (NVM) for the storage of user data from one or more host devices 106, such as other data storage devices, a network server, a network node, or a remote controller.
- FIG. 2 displays an example data storage device 110 generally corresponding to the device 100 in FIG. 1 .
- The device 110 is configured as a solid state drive (SSD) that communicates with one or more host devices via one or more Peripheral Component Interface Express (PCIe) ports, although other configurations can be used.
- The NVM is contemplated as comprising NAND flash memory, although other forms of solid state non-volatile memory can be used.
- The SSD operates in accordance with the NVMe (Non-Volatile Memory Express) Standard, which enables different users to allocate die sets for use in the storage of data.
- Each die set may form a portion of a Namespace that may span multiple SSDs or be contained within a single SSD.
- The SSD 110 includes a controller circuit 112 with a front end controller 114, a core controller 116, and a back end controller 118.
- The front end controller 114 performs host I/F functions, the back end controller 118 directs data transfers with the memory module 140, and the core controller 116 provides top level control for the device.
- Each controller 114 , 116 and 118 includes a separate programmable processor with associated programming (e.g., firmware, FW) in a suitable memory location, as well as various hardware elements to execute data management and transfer functions.
- A pure hardware based controller configuration can also be used.
- The various controllers may be integrated into a single system on chip (SOC) integrated circuit device, or may be distributed among various discrete devices as required.
- A controller memory 120 represents various forms of volatile and/or non-volatile memory (e.g., SRAM, DDR DRAM, flash, etc.) utilized as local memory by the controller 112.
- Various data structures and data sets may be stored by the memory including one or more map structures 122 , one or more caches 124 for map data and other control information, and one or more data buffers 126 for the temporary storage of host (user) data during data transfers.
- A non-processor based hardware assist circuit 128 may enable the offloading of certain memory management tasks by one or more of the controllers as required.
- The hardware circuit 128 does not utilize a programmable processor, but instead uses various forms of hardwired logic circuitry such as application specific integrated circuits (ASICs), gate logic circuits, field programmable gate arrays (FPGAs), etc.
- Additional functional blocks can be realized in hardware and/or firmware in the controller 112 , such as a data compression block 130 and an encryption block 132 .
- The data compression block 130 applies lossless data compression to input data sets during write operations, and subsequently provides data de-compression during read operations.
- The encryption block 132 provides any number of cryptographic functions to input data including encryption, hashes, decompression, etc.
- A device management module (DMM) 134 supports back end processing operations and may include an outer code engine circuit 136 to generate outer code, a device I/F logic circuit 137, and a low density parity check (LDPC) circuit 138 configured to generate LDPC codes as part of the error detection and correction strategy used to protect the data stored by the SSD 110.
- A memory module 140 corresponds to the memory 104 in FIG. 1 and includes a non-volatile memory (NVM) in the form of a flash memory 142 distributed across a plural number N of flash memory dies 144.
- Rudimentary flash memory control electronics may be provisioned on each die 144 to facilitate parallel data transfer operations via one or more channels (lanes) 146 .
- FIG. 3 shows an arrangement of the various flash memory dies 144 in the flash memory 142 of FIG. 2 in some embodiments. Other configurations can be used.
- The smallest unit of memory that can be accessed at a time is referred to as a page 150.
- A page may be formed using a number of flash memory cells that share a common word line.
- The storage size of a page can vary; current generation flash memory pages can store, in some cases, 16 KB (16,384 bytes) of user data.
- The memory cells 148 associated with a number of pages are integrated into an erasure block 152, which represents the smallest grouping of memory cells that can be concurrently erased in a NAND flash memory.
- A number of erasure blocks 152 are in turn incorporated into a garbage collection unit (GCU) 154, which is a logical structure that utilizes erasure blocks selected from different dies. GCUs are allocated and erased as a unit.
- A GCU may be formed by selecting one or more erasure blocks from each of a population of dies so that the GCU spans the population of dies.
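- The die-spanning GCU layout can be sketched as follows (a hypothetical helper, not the patent's implementation), selecting an erasure block from each die in a population so that the resulting GCU stretches across all of them:

```python
from typing import Dict, List, Tuple

def allocate_gcu(free_blocks: Dict[int, List[int]], blocks_per_die: int = 1) -> List[Tuple[int, int]]:
    """Form a GCU by taking `blocks_per_die` free erasure blocks from each die.
    Returns (die, erasure_block) pairs; the GCU is later erased as a unit."""
    gcu = []
    for die, blocks in sorted(free_blocks.items()):
        if len(blocks) < blocks_per_die:
            raise RuntimeError(f"die {die} has no free erasure block for this GCU")
        gcu.extend((die, blocks.pop()) for _ in range(blocks_per_die))
    return gcu

# Hypothetical pool of free erasure blocks on four dies.
pool = {0: [5, 9], 1: [2], 2: [7, 8], 3: [1]}
print(allocate_gcu(pool))   # one erasure block per die -> the GCU spans all four dies
```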
- Each die 144 may include a plurality of planes 156 . Examples include two planes per die, four planes per die, etc. although other arrangements can be used. Generally, a plane is a subdivision of the die 144 arranged with separate read/write/erase circuitry such that a given type of access operation (such as a write operation, etc.) can be carried out simultaneously by each of the planes to a common page address within the respective planes.
- FIG. 4 shows further aspects of the flash memory 142 in some embodiments.
- A total number K of dies 144 are provided and arranged into physical die groups 158.
- Each die group 158 is connected to a separate channel 146 using a total number of L channels.
- In one example, K is set to 128 dies and L is set to 8 channels, so that each physical die group has 16 dies.
- A single die within each physical die group can be accessed at a time using the associated channel.
- A flash memory electronics (FME) circuit 160 of the flash memory module 142 controls each of the channels 146 to transfer data to and from the dies 144.
- The various dies are arranged into one or more die sets.
- A die set represents a portion of the storage capacity of the SSD that is allocated for use by a particular host (user/owner).
- Die sets are usually established with a granularity at the die level, so that some percentage of the total available dies 144 will be allocated for incorporation into a given die set.
- A first example die set is denoted at 162 in FIG. 4.
- This first set 162 uses a single die 144 from each of the different channels 146 .
- This arrangement provides fast performance during the servicing of data transfer commands for the set since all eight channels 146 are used to transfer the associated data.
- A limitation with this approach is that if the set 162 is being serviced, no other die sets can be serviced during that time interval. While the set 162 only uses a single die from each channel, the set could also be configured to use multiple dies from each channel, such as 16 dies/channel, 32 dies/channel, etc.
- A second example die set is denoted at 164 in FIG. 4.
- This set uses dies 144 from less than all of the available channels 146 .
- This arrangement provides relatively slower overall performance during data transfers as compared to the set 162 , since for a given size of data transfer, the data will be transferred using fewer channels.
- This arrangement advantageously allows the SSD to service multiple die sets at the same time, provided the sets do not share the same (e.g., an overlapping) channel 146.
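- A hedged sketch of this channel-sharing constraint: two die sets can be serviced concurrently only if the channels their dies sit on do not overlap. The die-to-channel layout assumed below follows the 128-die/8-channel example, but the helper itself is hypothetical:

```python
from typing import Set

def channels_used(die_set: Set[int], dies_per_channel: int = 16) -> Set[int]:
    """Map each die number to its channel, assuming dies are grouped
    consecutively (16 dies per channel as in the 128-die/8-channel example)."""
    return {die // dies_per_channel for die in die_set}

def can_service_concurrently(set_a: Set[int], set_b: Set[int]) -> bool:
    # Concurrent servicing is possible only when no channel is shared.
    return channels_used(set_a).isdisjoint(channels_used(set_b))

set_162 = {0, 16, 32, 48, 64, 80, 96, 112}   # one die from each of the 8 channels
set_164 = {17, 33}                           # dies from only two channels
print(can_service_concurrently(set_162, set_164))   # False: the sets share channels
```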
- FIG. 5 illustrates a manner in which data may be stored to the flash memory module 142 .
- Map units (MUs) 170 represent fixed sized blocks of data that are made up of one or more user logical block address units (LBAs) 172 supplied by the host.
- LBAs 172 may have a first nominal size, such as 512 bytes (B), 1024 B (1 KB), etc.
- The MUs 170 may have a second nominal size, such as 4096 B (4 KB), etc.
- The application of data compression may cause each MU to have a smaller size in terms of actual bits written to the flash memory 142.
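- Using the nominal sizes given above (512 B LBAs, 4 KB MUs, and the 16 KB page mentioned earlier), a quick worked check of how the units nest:

```python
LBA_SIZE = 512          # bytes per logical block (first nominal size)
MU_SIZE = 4096          # bytes per map unit (second nominal size)
PAGE_SIZE = 16 * 1024   # bytes per flash page in the example above

lbas_per_mu = MU_SIZE // LBA_SIZE       # 8 LBAs packed into each MU
mus_per_page = PAGE_SIZE // MU_SIZE     # 4 MUs written per 16 KB page
print(lbas_per_mu, mus_per_page)
```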
- The MUs 170 are arranged into the aforementioned pages 150 (FIG. 3) which are written to the memory 142.
- The pages may be written to memory cells that share a common control line (e.g., word line) and that are configured as multi-level cells (MLCs), three-level cells (TLCs), four level cells (XLCs), etc.
- Data stored by an SSD are often managed using metadata.
- The metadata provide map structures to track the locations of various data blocks (e.g., MUAs 170) to enable the SSD 110 to locate the physical location of existing data. For example, during the servicing of a read command it is generally necessary to locate the physical address within the flash memory 144 at which the most current version of a requested block (e.g., LBA) is stored, so that the controller can schedule and execute a read operation to return the requested data to the host.
- FIG. 6 shows a functional block diagram for a GCU management circuit 180 of the SSD 110 in accordance with some embodiments.
- The circuit 180 may form a portion of the controller 112 and may be realized using hardware circuitry and/or one or more programmable processor circuits with associated firmware in memory.
- The circuit 180 includes the use of a forward map 182 and a reverse directory 184.
- The forward map and reverse directory are metadata data structures that describe the locations of the data blocks in the flash memory 142.
- The respective portions of these data structures are located in the flash memory or other non-volatile memory location and copied to local memory 120 (see e.g., FIG. 2).
- The forward map 182 provides a flash transition layer (FTL) to generally provide a correlation between the logical addresses of various blocks (e.g., MUAs) and the physical addresses at which the various blocks are stored (e.g., die set, die, plane, GCU, EB, page, bit offset, etc.).
- The contents of the forward map 182 may be stored in specially configured and designated GCUs in each die set.
- The reverse directory 184 provides a physical address to logical address correlation.
- The reverse directory contents may be written as part of the data writing process to each GCU, such as in the form of a header or footer along with the data being written.
- The reverse directory provides an updated indication of how many of the data blocks (e.g., MUAs) are valid (e.g., represent the most current version of the associated data).
- The circuit 180 further includes a map integrity control circuit 186.
- This control circuit 186 generally operates at selected times to recall and compare, for a given GCU, the forward map data and the reverse directory data.
- This evaluation step includes processing to determine if both metadata structures indicate the same number and identity of the valid data blocks in the GCU.
- If the structures match, the GCU is added to a list of verified GCUs in a data structure referred to as a table of verified GCUs, or TOVG 188.
- The table can take any suitable form and can include a number of entries, with one entry for each GCU. Each entry can list the GCU as well as other suitable and useful information, such as but not limited to a time stamp at which the evaluation took place, the total number of valid data blocks that were determined to be present at the time of validation, a listing of the actual valid blocks, etc.
- If a mismatch is detected, the control circuit 186 can further operate to perform a detailed evaluation to correct the mismatch. This may include replaying other journals or other data structures to trace the history of those data blocks found to be mismatched. The level of evaluation required will depend on the extent of the mismatch between the respective metadata structures.
- For example, if the forward map 182 indicates that there should be some number X of valid blocks in the selected GCU, such as 12 valid blocks, but the reverse directory 184 indicates that there are only Y valid blocks, such as 11 valid blocks, and the 11 valid blocks indicated by the reverse directory 184 are indicated as valid by the forward map, then the focus can be upon the remaining one block that is valid according to the forward map but invalid according to the reverse directory.
- An exception list 190 may be formed as a data structure in memory of GCUs that have been found to require further evaluation. In this way, the GCUs can be evaluated later at an appropriate time for resolution, after which the corrected GCUs can be placed on the verified list in the TOVG 188.
- GCUs that are approaching the time at which a garbage collection operation may be suitable, such as after the GCU has been filled with data and/or has reached a certain aging limit, etc., may be selected for evaluation on the basis that it can be expected that a garbage collection operation may be necessary in the relatively near future.
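- The verification and exception handling described above can be pictured with the hedged Python sketch below; the data structures and field names are illustrative, not the patent's. The block lists from the forward map and the reverse directory are compared, and the GCU lands either in the table of verified GCUs (TOVG) or on the exception list:

```python
import time

def verify_gcu(gcu_id: int,
               forward_map_valid: set,
               reverse_dir_valid: set,
               tovg: dict, exceptions: list) -> bool:
    """Compare the two metadata views of a GCU. A match is recorded in the
    table of verified GCUs (TOVG); a mismatch defers the GCU for later review."""
    if forward_map_valid == reverse_dir_valid:
        tovg[gcu_id] = {
            "verified_at": time.time(),
            "valid_count": len(forward_map_valid),
            "valid_blocks": sorted(forward_map_valid),
        }
        return True
    exceptions.append(gcu_id)      # needs journal replay / detailed evaluation
    return False

tovg, exceptions = {}, []
verify_gcu(7, {10, 11, 12}, {10, 11, 12}, tovg, exceptions)   # matches -> TOVG entry
verify_gcu(8, {10, 11, 12}, {10, 11}, tovg, exceptions)       # X=3 vs Y=2 -> exception list
print(list(tovg), exceptions)
```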
- FIG. 6 further shows the GCU management circuit 180 to include a garbage collection scheduler circuit 192 .
- This circuit 192 generally operates once it is appropriate to consider performing a garbage collection operation, at which point the circuit 192 selects from among the available verified GCUs from the table 188 .
- The circuit 192 may generate a time of completion estimate to complete the garbage collection operation based on the size of the GCU, the amount of data to be relocated, etc.
- A garbage collection operation can include accessing the forward map and/or reverse directory 182, 184 to identify the still valid data blocks, the reading out and temporary storage of such blocks in a local buffer memory, the writing of the blocks to a new location such as in a different GCU, the application of an erasure operation to erase each of the erasure blocks in the GCU, the updating of program/erase count metadata to indicate the most recent erasure cycle, and the placement of the reset GCU into an allocation pool awaiting subsequent allocation and use for the storage of new data sets.
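- A rough sketch of the completion-time estimate such a scheduler circuit might form is shown below; the per-step timing constants are made-up values for illustration, and the estimate simply scales with the amount of still-valid data to relocate plus an erase cost per erasure block:

```python
def estimate_gc_seconds(valid_pages: int,
                        erasure_blocks: int,
                        read_us_per_page: float = 90.0,
                        program_us_per_page: float = 600.0,
                        erase_ms_per_block: float = 3.5) -> float:
    """Estimate garbage-collection time: read out valid pages, rewrite them
    to a new GCU, then erase every erasure block in the old GCU."""
    relocate_us = valid_pages * (read_us_per_page + program_us_per_page)
    erase_us = erasure_blocks * erase_ms_per_block * 1000.0
    return (relocate_us + erase_us) / 1_000_000.0

# A GCU with 2,000 still-valid pages spread over 32 erasure blocks.
print(f"{estimate_gc_seconds(2000, 32):.2f} s")
```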
- FIG. 7 shows a number of die sets 200 that may be arranged across the SSD 110 in some embodiments.
- Each set 200 may have the same nominal data storage capacity (e.g., the same number of allocated dies, etc.), or each may have a different storage capacity.
- The storage capacity of each die set 200 is arranged into a number of GCUs 154 as shown.
- A separate TOVG (table of verified GCUs) 188 may be maintained by and in each die set 200 to show the status of the respective GCUs.
- The table 188 can be consulted to select a GCU that, with a high degree of probability, can be subjected to an efficient garbage collection operation without any unexpected delays due to mismatches in the metadata (forward map and reverse directory).
- FIG. 8 further shows the GCU management circuit 190 to include a garbage collection scheduler circuit 202 .
- This circuit 202 generally operates once it is appropriate to consider performing a garbage collection operation, at which point the circuit 202 selects from among the available verified GCUs from the table 198 .
- The circuit 202 may generate a time of completion estimate to complete the garbage collection operation based on the size of the GCU, the amount of data to be relocated, etc.
- A garbage collection operation can include accessing the forward map and/or reverse directory 192, 194 to identify the still valid data blocks, the reading out and temporary storage of such blocks in a local buffer memory, the writing of the blocks to a new location such as in a different GCU, the application of an erasure operation to erase each of the erasure blocks in the GCU, the updating of program/erase count metadata to indicate the most recent erasure cycle, and the placement of the reset GCU into an allocation pool awaiting subsequent allocation and use for the storage of new data sets.
- FIG. 9 shows a number of die sets 210 that may be arranged across the SSD 110 in some embodiments.
- Each set 210 may have the same nominal data storage capacity (e.g., the same number of allocated dies, etc.), or each may have a different storage capacity.
- The storage capacity of each die set 210 is arranged into a number of GCUs 154 as shown.
- A separate TOVG (table of verified GCUs) 198 may be maintained by and in each die set 210 to show the status of the respective GCUs.
- The table 198 can be consulted to select a GCU that, with a high degree of probability, can be subjected to an efficient garbage collection operation without any unexpected delays due to mismatches in the metadata (forward map and reverse directory).
- FIG. 10 shows a functional block representation of additional aspects of the SSD 110 .
- The core CPU 116 from FIG. 2 is shown in conjunction with a code management engine (CME) 212 that can be used to manage the generation of the respective code words and outer code parity values for both standard and non-standard parity data sets.
- During write operations, input data are arranged into MUs 160 (FIG. 3), which are placed into a non-volatile write cache 214 which may be flash memory or other form(s) of non-volatile memory.
- The MUs are transferred to the DMM circuit 134 for writing to the flash memory 142 in the form of code words 172 as described above.
- During read operations, one or more pages of data are retrieved to a volatile read buffer 216 for processing prior to transfer to the host.
- The CME 212 determines the appropriate inner and outer code rates for the data generated and stored to memory.
- In some cases, the DMM circuit 134 may generate both the inner and outer codes.
- In other cases, the DMM circuit 134 generates the inner codes (see e.g., LDPC circuit 138 in FIG. 2) and the core CPU 116 generates the outer code words.
- In still other cases, the same processor/controller circuit generates both forms of code words. Other arrangements can be used as well.
- The CME 212 establishes appropriate code rates for both types of code words.
- A parity buffer 218 may be used to successively XOR each payload being written during each pass through the dies. Both payload data 220 and map data 222 will be stored to flash 142.
- FIG. 11 plots example operational data for a data storage system configured and operated in accordance with various embodiments to improve data read performance during deterministic windows.
- Read latency is charted over time involving deterministic window (DW) and non-deterministic window (NDW) intervals.
- Read latency, as indicated by solid line 234, of a plurality of reads to different portions of a memory is maintained within a relatively tight range 236, which corresponds with data read consistency over time.
- Different data read performance metrics, such as error rate and overall time to return data to a host, can be used in substitution of, or in combination with, the read latency of FIG. 11, with similarly tight ranges 236, and approximately uniform consistency, of read performance being maintained.
- The tight consistency of data reads during the DW can be, at least partially, attributed to background data maintenance operations and/or data writes being reduced or suspended.
- A DW interval is followed by one or more NDW intervals, such as interval 238, where pending data writes and background data maintenance operations are carried out along with data reads.
- The second NDW 240 shows how data accesses and data maintenance operations are not consistent and can be considered random compared to the tight range 236 of data latency performance of the DW intervals 232 and 242. It is noted that the consistent performance for the first DW interval 232 is at a different latency value than the second DW interval 242. Hence, consistency is prioritized throughout a DW interval regardless of the latency value that is consistently provided to a host, even at the expense of providing less than the fastest possible read performance. In other words, predictable read latency, and performance, are emphasized during a DW interval even if that means providing higher read latency than possible.
- FIG. 12 is a block representation of portions of an example data storage system 250 in which various embodiments may be practiced.
- A number of data storage devices 252 can be connected to one or more remote hosts 254 as part of a distributed network that generates, transfers, and stores data as requested by the respective hosts 254.
- Each remote host 254 is assigned a single logical die set 256, which can be some, or all, of a data storage device 252 memory, such as an entire memory die 258 or a portion of a memory die, like a plane of a memory die.
- One or more data queues can be utilized to temporarily store data, and data access commands, awaiting storage in a die set 256 or awaiting delivery from a die set 256.
- An example data storage system 270 is displayed that illustrates how increased-scale distributed data storage networks make quality of service difficult.
- The data storage system 270 has at least one local controller 272 that organizes, activates, and monitors data accesses between a plurality of different remote hosts 274 and different logical die sets 276.
- Any number of data queues 278 can be logically positioned between the hosts 274 and die sets 276, without limitation.
- The assorted die sets 276 can be arranged alone, or in combination with other die sets 276, in a memory die of a data storage device.
- Write amplification can occur when the actual amount of data written to a die set 276 is greater than the amount requested by a host 274.
- Write amplification can be exacerbated by loss of data capacity and/or overprovisioned portions of a die set 276, such as through failures, errors, or other memory-occupying activity.
- One or more buffers 280 can be utilized for temporary data storage while portions of a die set 276 are repaired to overcome write amplification, but such practice can be relatively slow and occupy precious processing capabilities that may degrade overall data access performance for the data storage system 270.
- A data storage system 270 can experience errors, faults, conflicts, and write amplification that temporarily, or permanently, degrade system 270 performance.
- While the local controller 272 can be configured to manage and resolve conditions and situations that degrade system 270 performance, such resolution can result in dissimilar data access performance characteristics being experienced by the respective hosts 274.
- The advent of high data access frequency hosts has emphasized consistent and nominally even data access performance across different die sets 276 and hosts 274 throughout the data storage system 270. Accordingly, various embodiments are directed to providing increasingly uniform data access performance for a data storage system 270 regardless of the events that alter and/or degrade data access performance between a die set 276 and a host 274.
- FIG. 14 depicts a block representation of an example quality of service module 290 configured and operated in accordance with some embodiments to balance data access performance across different die sets and hosts.
- The quality of service module 290 can intelligently utilize constituent circuitry to track performance metrics of executed tasks between multiple different die sets and hosts in a distributed data storage system to reactively and proactively optimize data access requests to concurrently balance data access performance across the system.
- The quality of service module 290 can utilize a controller 292, such as a microprocessor or programmable circuitry generally represented by controller 272 of FIG. 13, to direct activity of various circuitry.
- Real-time data storage system performance metrics, such as latency, error rate, overall time to service a host request, number of background operations triggered, overall queue input-output frequency, and deterministic window interval activation, can be measured and/or detected with a monitor circuit 294.
- The monitor circuit 294 may maintain a log 296 of detected die set and host activity in local memory in order to allow a prediction circuit 298 of the module 290 to identify patterns and consequential data access tasks.
- The prediction circuit 298 can utilize model data from other data storage systems and/or past logged activity from the present system to predict what tasks are likely to arrive in a die set queue as well as how long each task will take to execute in various die sets of a distributed data storage system in view of the present system conditions.
- The prediction circuit 298 can employ machine learning to improve the accuracy of forecasted background operations, read accesses, and write accesses, as well as the performance of those forecasted tasks, based on real-time tracked executions from the monitor circuit 294. It is contemplated that the prediction circuit 298 can generate an accuracy value for forecasted tasks, and/or forecasted performance, and only provide those predictions that are above a predetermined accuracy threshold, such as 90% confidence.
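- The confidence-gated forecasting described above might look like the following sketch (the forecast fields and threshold handling are hypothetical assumptions): only predictions whose accuracy value clears the threshold, such as 90%, are passed along:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Forecast:
    task: str            # e.g. "background_gc", "read", "write"
    die_set: str
    est_seconds: float   # predicted execution time
    confidence: float    # 0.0 - 1.0 accuracy value from the prediction circuit

def usable_forecasts(forecasts: List[Forecast], threshold: float = 0.90) -> List[Forecast]:
    """Only forecasts at or above the accuracy threshold are handed onward."""
    return [f for f in forecasts if f.confidence >= threshold]

preds = [
    Forecast("background_gc", "set_A", 0.42, 0.95),
    Forecast("write", "set_B", 0.0011, 0.72),     # too uncertain, filtered out
]
print([f.task for f in usable_forecasts(preds)])
```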
- The ability to predict future tasks and their respective execution times to numerous different die sets with the prediction circuit 298 allows the quality of service module 290 to organize existing tasks so that future tasks do not inhibit or degrade consistent read access latency during deterministic window intervals.
- Knowledge of past executed tasks to a die set attained with the monitor circuit 294 and the accurate prediction of future pending tasks and their execution times allow a scheduler circuit 300 of the module 290 to customize existing queued tasks to at least one die set to optimize future data storage system operation. Queue customization is not limited to a particular action, but it is contemplated that the scheduler circuit 300 correlates certain tasks to available system processing bandwidth, prioritizes the longest tasks to execute, prioritizes the shortest tasks to execute, and/or generates background operations out-of-turn.
- The quality of service module 290 can utilize a test circuit 302 to carry out one or more data access operations to at least one portion of any die set to collect operational data that can increase the accuracy and speed of the monitor 294 and prediction 298 circuits. That is, one or more test patterns of data reads and/or data writes can be conducted to one or more different die sets with the test circuit 302 to verify measurements by the monitor circuit 294, test for un-monitored performance characteristics, such as memory cell settling, write amplification, or environmental conditions, and measure the data access performance of less than all of a die set.
- A throttle circuit 304 can resolve such issues by altering a queued task to manipulate the task's execution performance. For instance, the throttle circuit 304 may split a task into two separately executed tasks, utilize less than all available system resources to execute a task, or deliberately delay a task during execution to control when a task completes. Such control of queued task execution performance can be particularly emphasized during DW intervals. Accordingly, the quality of service module 290 has a DW circuit 306 that can operate alone, or with other circuits, to choose and/or manipulate pending die set tasks to ensure consistent data access performance for each die set and host of a data storage system throughout the guaranteed interval time period.
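- A hedged illustration of the throttling options just mentioned (the task fields and delay mechanics are assumed, not the patent's): a queued task can be split into two separately executed tasks or given a deliberate delay so that its completion lands where the quality of service module wants it:

```python
from dataclasses import dataclass, replace
from typing import List

@dataclass
class Task:
    name: str
    size_bytes: int
    delay_ms: float = 0.0    # deliberate delay added before completion

def split_task(task: Task) -> List[Task]:
    """Split one task into two halves that can be executed separately."""
    half = task.size_bytes // 2
    return [replace(task, name=task.name + "_a", size_bytes=half),
            replace(task, name=task.name + "_b", size_bytes=task.size_bytes - half)]

def delay_task(task: Task, extra_ms: float) -> Task:
    """Pad the task with a deliberate delay so it completes on schedule."""
    return replace(task, delay_ms=task.delay_ms + extra_ms)

big_write = Task("write_setB", size_bytes=1_048_576)
print(split_task(big_write))
print(delay_task(big_write, extra_ms=2.5))
```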
- A data storage system controller may compile measured and/or predicted system information together in order to generate a strategy for establishing and maintaining balanced data access performance for each die set and host. Such compiled information is conveyed in FIG. 14 as a dashboard 308. It is noted that the dashboard 308 is merely exemplary and in no way limits the possible information compiled by a data storage system controller.
- The example dashboard 308 has several different real-time metrics 310 measured by at least one monitor circuit 294 and several different predicted metrics 312 forecasted by at least one prediction circuit 298.
- The real-time metrics 310 may be average latency 314 (read and/or write), error rate 316, read-write ratio 318, and I/O frequency 320, while the predicted metrics 312 may be read time to host 322, write request completion time 324, number of future background operations 326, and average read latency 328.
- Other real-time 310 and/or predicted 312 metrics can be computed by a system controller and displayed, or not displayed, on the dashboard 308.
- The metrics allow for real-time operational information to be calculated and displayed.
- Real-time execution times for read requests 330, write requests 332, and background operations 334 can represent current, measured access to some, or all, of a data storage system.
- The displayed execution times 330/332/334 may be statistics for a single data access operation or an average of multiple accesses, such as the immediate past ten data reads, data writes, or background operations.
- A controller can compute a single, or average, read access request execution time 336 while in DW interval conditions and single, or average, execution times to complete a read 338, write 340, or background operation 342 during NDW interval conditions.
- The various predicted DW and NDW data access execution times can allow a scheduler circuit to intelligently choose which queued data access operations to execute in order to prepare a die set for more consistent DW interval performance. Such execution selection may involve reorganizing queued data access commands or changing the queue execution order without rewriting the queue.
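- The queue-ordering decision can be sketched as below with a hypothetical greedy policy (not the patent's algorithm): given predicted execution times, pending commands whose estimates fit inside the remaining time budget before a DW interval are run first, without rewriting the queue contents:

```python
from typing import List, Tuple

def execution_order(queue: List[Tuple[str, float]], budget_s: float) -> List[int]:
    """Return queue indices in the order they should execute: commands whose
    predicted time fits the remaining pre-DW budget first (shortest first),
    then everything else in original order."""
    fits, rest, remaining = [], [], budget_s
    for idx, (_, est) in sorted(enumerate(queue), key=lambda p: p[1][1]):
        if est <= remaining:
            fits.append(idx)
            remaining -= est
        else:
            rest.append(idx)
    rest.sort()                  # preserve original order for the leftovers
    return fits + rest

pending = [("write_A", 0.8), ("gc_B", 2.5), ("read_C", 0.1), ("write_D", 0.7)]
print(execution_order(pending, budget_s=1.0))   # -> [2, 3, 0, 1]
```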
- The ability to predict execution times for data accesses based on actual, detected operations to a die set allows the dashboard to be highly accurate and precise, which corresponds with optimized deterministic I/O for one or more die sets of a data storage system.
- various embodiments utilize the quality of service module 290 to generate and test a plurality of different hypothetical future data access scenarios in an effort to develop one or more proactive strategies for preventing deviation from a nominally consistent data access performance for each die set and host.
- a scheduler circuit can input computed maximum data access execution time 344 , minimum data access execution time 346 , and risk of deviation over time based on write amplification, errors, and conflicts 348 to proactively modify one or more die set queues and pending data access requests to prevent a predicted event, such as a loss of data capacity in at least one die set, and maintain nominally consistent data access performance throughout the data storage system.
- the predicted events and proactive modifications can be complex, particularly when some die sets are in a DW interval while other die sets are in NDW intervals.
- the quality of service module 290 may continually operate to provide multiple different proactive data access modification strategies that can be selected depending on the DW/NDW configuration of the various die sets and hosts of a data storage system. That is, the quality of service module 290 can generate multiple different predicted system events that depend on the number, frequency, and length of DW intervals requested by the hosts of the data storage system.
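- The strategy generation described above might be sketched as follows, where the inputs loosely correspond to the computed maximum execution time 344 , minimum execution time 346 , and deviation risk 348 ; the throttling heuristic and the number of die sets are assumptions for illustration.

```python
def build_strategies(max_exec_us, min_exec_us, deviation_risk, die_set_count=4):
    """Return one candidate strategy per possible number of die sets in a DW interval."""
    strategies = {}
    headroom = max_exec_us - min_exec_us
    for dw_count in range(die_set_count + 1):
        strategies[dw_count] = {
            "throttle_to_us": min_exec_us + headroom * (1.0 - deviation_risk),
            "defer_background": dw_count > 0,        # keep background ops out of DW windows
            "rebalance_queues": deviation_risk > 0.5,
        }
    return strategies

def select_strategy(strategies, die_sets_in_dw):
    """Pick the pre-generated strategy matching the current DW/NDW configuration."""
    return strategies.get(die_sets_in_dw, strategies[max(strategies)])

plans = build_strategies(max_exec_us=240.0, min_exec_us=90.0, deviation_risk=0.3)
print(select_strategy(plans, die_sets_in_dw=2))
```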
- FIG. 15 conveys a static reactive data access process 360 that can be carried out by the quality of service module 290 in a distributed data storage system 270 in accordance with some embodiments.
- the process 360 can begin with the generation of at least one proactive data access modification strategy in step 362 that is at least partially executed in step 364 before an unpredicted system condition is encountered in step 366 .
- Such an unpredicted condition may be a memory cell, die set, die, or data storage device failure that necessitates additional system resources, such as power, buffer usage, processing power, or other available resource, to maintain existing data access performance.
- the quality of service module can establish one or more thresholds that correspond with how drastic the unpredicted condition will be to resolve.
- a threshold may be actions that are predicted to be needed to return to a nominally consistent die set/host data access performance for the system, such as an amount of time, a percentage of available system resources, and/or a likelihood of returning to prior data access performance levels.
- Decision 368 evaluates the established thresholds compared to the predicted results of the unpredicted condition(s).
- In the event at least one threshold is exceeded, step 370 is triggered to conduct a predetermined action to mitigate the severity of non-uniformity amongst the data access performance of the various system die sets.
- the predetermined action of step 370 is not limited to a particular routine, but can involve diverting pending data access commands of a problematic die set to a system buffer where the destinations of the various data accesses are rebalanced among the other die sets of the system.
- Other aspects of step 370 may involve executing only read access requests while a problematic die set is repaired and/or reassigning pending data writes, via a scheduler circuit, to other die sets of the system.
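- A simplified sketch of this static reactive flow is shown below, with hypothetical threshold values, covering the threshold comparison of decision 368 and the buffer-and-rebalance behavior of step 370 .

```python
def exceeds_threshold(predicted_repair_s, predicted_resource_pct,
                      max_repair_s=5.0, max_resource_pct=20.0):
    """Compare the predicted cost of the unpredicted condition against fixed limits."""
    return predicted_repair_s > max_repair_s or predicted_resource_pct > max_resource_pct

def divert_and_rebalance(problem_queue, healthy_queues):
    """Buffer the problem die set's pending commands, then spread them over healthy die sets."""
    buffered = list(problem_queue)
    problem_queue.clear()
    for i, command in enumerate(buffered):
        healthy_queues[i % len(healthy_queues)].append(command)   # round-robin rebalance

queues = {"set0": ["w1", "r2", "w3"], "set1": [], "set2": []}
if exceeds_threshold(predicted_repair_s=8.0, predicted_resource_pct=12.0):
    divert_and_rebalance(queues["set0"], [queues["set1"], queues["set2"]])
print(queues)   # set0 emptied, its commands rebalanced to set1 and set2
```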
- FIG. 16 depicts an example dynamic reactive data access process 380 that can be carried out by the quality of service module 290 in a distributed data storage system 270 .
- steps 382 , 384 , and 386 generate a proactive strategy, execute the strategy, and subsequently encounter one or more unpredicted events.
- Step 388 evaluates the severity of the unpredicted event, much like decision 368 , but process 380 responds to the unpredicted event with a relatively small exertion of system resources in step 390 before returning to the evaluation of step 388 .
- step 390 may simply slow the execution of data access commands to a problematic die set in order to give a system controller time to rebalance and/or repair the unpredicted event.
- step 388 may buffer less than all of the pending data access requests from a queue to minimize the redirection of access requests.
- If the evaluation of step 388 determines that the unpredicted event persists, step 390 is activated again with an increased amount of system resources being utilized to mitigate, and/or repair, the existing unpredicted event.
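- The escalating response of process 380 can be summarized in the following sketch, in which the starting resource commitment, the escalation factor, and the round limit are all hypothetical.

```python
def dynamic_response(event_resolved, max_rounds=4):
    """event_resolved(resource_pct) returns True once the unpredicted event is mitigated."""
    resource_pct = 5                       # begin with a deliberately small exertion
    for _ in range(max_rounds):
        if event_resolved(resource_pct):
            return resource_pct            # resolved with minimal system disruption
        resource_pct *= 2                  # escalate the resources committed
    return resource_pct                    # largest effort attempted

# Example: an event that needs at least 15% of system resources to resolve.
print(dynamic_response(lambda pct: pct >= 15))   # 20
```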
- FIG. 17 is a flowchart of an example balancing routine 400 that can be employed in a data storage system by a quality of service module in accordance with various embodiments.
- step 402 populates die set queues with data access tasks to be performed in step 404 .
- step 404 can execute tasks concurrently, and/or sequentially, in some, or all, of the die sets of a data storage system.
- Step 406 tracks the real-time performance of the various data access task executions to compile system metrics that can be useful to a prediction circuit to predict future data accesses that are likely to enter the assorted system queues in step 408 and predict future data access execution performance in step 410 .
- In response to a predicted performance bottleneck, step 414 may execute one or more proactive data access modifications, such as altering data access destinations, data access storage locations, data access order in a queue, or execution time. It is contemplated that the proactive data access modification can be conducted on any die set and/or queue of a system, which may, or may not, be the die set/queue where the performance bottleneck is predicted. For instance, step 414 can throttle the data access performance of each die set unaffected by the predicted bottleneck to a performance level that is less than the capabilities of the respective die sets in order to maintain a nominal consistency of data access performance throughout the data storage system, as sketched below.
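- The uniform throttling contemplated for step 414 could look like the following sketch, where the die set capabilities and the bottleneck rate are illustrative IOPS figures.

```python
def uniform_throttle(die_set_capability_iops, bottleneck_iops):
    """Cap every die set at the predicted bottleneck rate so performance stays uniform."""
    return {name: min(capability, bottleneck_iops)
            for name, capability in die_set_capability_iops.items()}

targets = uniform_throttle({"set0": 900_000, "set1": 850_000, "set2": 400_000},
                           bottleneck_iops=400_000)
print(targets)   # every die set limited to 400,000 IOPS
```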
- decision 416 evaluates if current system quality of service, such as data access latency, error rate, and overall execution time, is adequately consistent for each die set of the system. If the system is properly consistent, the routine 400 returns to step 402 where new data accesses are queued and subsequently executed. If the real-time quality of service is not consistent enough, decision 416 prompts decision 418 to evaluate which, if any, die sets are in a DW interval. The presence of a DW interval can disrupt potential die set consistency resolution strategies generated by the quality of service module.
- a DW interval is met with step 420 artificially reducing the data access performance of some die sets so that the entire data storage system experiences the same data access performance as the die set(s) in the DW interval.
- step 422 proceeds to rebalance or otherwise modify one or more die set queues to resolve the situation(s) that resulted in the inconsistent quality of service in decision 416 .
- step 424 is able to bring the data access performance of each of the die sets of the system up to an unthrottled maximum consistent level.
- step 422 rebalances one or more die set queues as quickly as possible so that consistent quality of service returns before a DW interval is requested from a host. That is, step 422 sacrifices short-term system consistency to ensure the system can service a DW interval request and provide long-term data access performance uniformity.
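- A condensed, non-authoritative sketch of this tail of routine 400 follows; the exact ordering of steps 420 , 422 , and 424 relative to decisions 416 and 418 is an assumption drawn from the description above, and the callback names are illustrative.

```python
def balance_tail(qos_consistent, dw_die_sets, all_die_sets,
                 throttle_to_dw, rebalance_queues, unthrottle):
    """Resolve inconsistent quality of service while respecting any active DW interval."""
    if qos_consistent:
        return "queue-new-accesses"                           # back to step 402
    if dw_die_sets:                                           # decision 418: DW present
        throttle_to_dw(set(all_die_sets) - set(dw_die_sets))  # step 420: match DW performance
        rebalance_queues()                                    # step 422: resolve inconsistency
        return "throttled-to-dw"
    rebalance_queues()                                        # step 422: rebalance quickly
    unthrottle(all_die_sets)                                  # step 424: restore maximum level
    return "rebalanced"

# Example wiring with no-op callbacks:
print(balance_tail(False, ["set1"], ["set0", "set1"],
                   throttle_to_dw=lambda sets: None,
                   rebalance_queues=lambda: None,
                   unthrottle=lambda sets: None))   # throttled-to-dw
```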
- a quality of service module can be utilized in non-semiconductor data storage, such as tape and rotating magnetic media.
- a plurality of logical die sets can be optimized with a quality of service module tracking data access execution performance metrics.
- the utilization of the tracked execution performance metrics to predict future read, write, and background operation execution performance allows the quality of service module to intelligently choose and execute pending die set access commands out of queued order to optimize current and/or future performance.
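- Such out-of-order selection might be sketched as follows, where the predicted command costs and the time budget are illustrative stand-ins for the tracked and predicted execution performance metrics.

```python
def pick_next(pending, predict_cost_us, budget_us):
    """Choose a pending command out of queued order without rewriting the queue."""
    candidates = [(cmd, predict_cost_us(cmd)) for cmd in pending]
    fitting = [c for c in candidates if c[1] <= budget_us]
    # Prefer the largest command that still fits the budget; otherwise the cheapest one.
    chosen = max(fitting, key=lambda c: c[1]) if fitting else min(candidates, key=lambda c: c[1])
    return chosen[0]

queue = ["read-a", "write-b", "background-c"]
costs = {"read-a": 80.0, "write-b": 300.0, "background-c": 150.0}
print(pick_next(queue, costs.get, budget_us=200.0))   # background-c
```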
- the ability to employ current command execution performance to predict both DW and NDW interval performance allows for current command execution that results in future performance improvements.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Microelectronics & Electronic Packaging (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Techniques For Improving Reliability Of Storages (AREA)
Abstract
Description
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/023,420 US11016679B2 (en) | 2018-06-29 | 2018-06-29 | Balanced die set execution in a data storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/023,420 US11016679B2 (en) | 2018-06-29 | 2018-06-29 | Balanced die set execution in a data storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
US20200004443A1 US20200004443A1 (en) | 2020-01-02 |
US11016679B2 true US11016679B2 (en) | 2021-05-25 |
Family
ID=69055181
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/023,420 Active 2038-08-03 US11016679B2 (en) | 2018-06-29 | 2018-06-29 | Balanced die set execution in a data storage system |
Country Status (1)
Country | Link |
---|---|
US (1) | US11016679B2 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- US11762680B2 (en) * | 2020-10-20 | 2023-09-19 | Alibaba Group Holding Limited | Method and system of host resource utilization reduction |
US11487446B2 (en) | 2020-12-03 | 2022-11-01 | Western Digital Technologies, Inc. | Overhead reduction in data transfer protocol for NAND memory |
US11586384B2 (en) | 2021-02-16 | 2023-02-21 | Western Digital Technologies, Inc. | Overhead reduction in data transfer protocol for data storage devices |
CN113778679B (en) * | 2021-09-06 | 2023-03-10 | 抖音视界有限公司 | Resource scheduling method, resource scheduling device, electronic device and readable storage medium |
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6401120B1 (en) | 1999-03-26 | 2002-06-04 | Microsoft Corporation | Method and system for consistent cluster operational data in a server cluster using a quorum of replicas |
US6507183B1 (en) * | 1999-07-02 | 2003-01-14 | Stmicroelectronics S.R.L. | Method and a device for measuring an analog voltage in a non-volatile memory |
US8046558B2 (en) | 2005-09-16 | 2011-10-25 | The Research Foundation Of State University Of New York | File system having predictable real-time performance |
US8233303B2 (en) | 2006-12-14 | 2012-07-31 | Rambus Inc. | Multi-die memory device |
US8238911B2 (en) | 2007-11-02 | 2012-08-07 | Qualcomm Incorporated | Apparatus and methods of configurable system event and resource arbitration management |
US9262163B2 (en) | 2012-12-29 | 2016-02-16 | Intel Corporation | Real time instruction trace processors, methods, and systems |
US9449018B1 (en) | 2013-11-25 | 2016-09-20 | Google Inc. | File operation task optimization |
US9626274B2 (en) | 2014-12-23 | 2017-04-18 | Intel Corporation | Instruction and logic for tracking access to monitored regions |
US20160357474A1 (en) * | 2015-06-05 | 2016-12-08 | Sandisk Technologies Inc. | Scheduling scheme(s) for a multi-die storage device |
US20180165016A1 (en) * | 2016-12-08 | 2018-06-14 | Western Digital Technologies, Inc. | Read tail latency reduction |
US20180188970A1 (en) * | 2016-12-30 | 2018-07-05 | Western Digital Technologies, Inc. | Scheduling access commands for data storage devices |
US10387340B1 (en) * | 2017-03-02 | 2019-08-20 | Amazon Technologies, Inc. | Managing a nonvolatile medium based on read latencies |
US20190114276A1 (en) * | 2017-10-18 | 2019-04-18 | Western Digital Technologies, Inc. | Uniform performance monitor for a data storage device and method of operation |
US20190332320A1 (en) * | 2018-04-26 | 2019-10-31 | Phison Electronics Corp. | Data writing method, memory control circuit unit and memory storage device |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11301376B2 (en) * | 2018-06-11 | 2022-04-12 | Seagate Technology Llc | Data storage device with wear range optimization |
US20220245269A1 (en) * | 2021-01-29 | 2022-08-04 | Seagate Technology Llc | Data storage system with decentralized policy alteration |
US11514183B2 (en) * | 2021-01-29 | 2022-11-29 | Seagate Technology Llc | Data storage system with decentralized policy alteration |
US20230199454A1 (en) * | 2021-12-20 | 2023-06-22 | Qualcomm Incorporated | Capabilities and configuration for nondata wireless services |
US12207169B2 (en) * | 2021-12-20 | 2025-01-21 | Qualcomm Incorporated | Capabilities and configuration for nondata wireless services |
US12282685B2 (en) | 2022-10-24 | 2025-04-22 | Samsung Electronics Co., Ltd. | Computational storage device, method for operating the computational storage device and method for operating host device |
Also Published As
Publication number | Publication date |
---|---|
US20200004443A1 (en) | 2020-01-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11150836B2 (en) | Deterministic optimization via performance tracking in a data storage system | |
US10817217B2 (en) | Data storage system with improved time-to-ready | |
US11016679B2 (en) | Balanced die set execution in a data storage system | |
US10776263B2 (en) | Non-deterministic window scheduling for data storage systems | |
US11899952B2 (en) | Lossless namespace metadata management system | |
US11132133B2 (en) | Workload-adaptive overprovisioning in solid state storage drive arrays | |
US9063874B2 (en) | Apparatus, system, and method for wear management | |
US9129699B2 (en) | Semiconductor storage apparatus and method including executing refresh in a flash memory based on a reliability period using degree of deterioration and read frequency | |
CN108984429B (en) | Data storage device with buffer occupancy management | |
US11848055B2 (en) | Asynchronous access multi-plane solid-state memory | |
US20220147279A1 (en) | Heat management solid-state data storage system | |
US20150067415A1 (en) | Memory system and constructing method of logical block | |
US10929025B2 (en) | Data storage system with I/O determinism latency optimization | |
US10783982B2 (en) | Probation bit for data storage memory | |
US12135895B2 (en) | Hot data management in a data storage system | |
US20200409874A1 (en) | Data storage system data access arbitration | |
US11307768B2 (en) | Namespace auto-routing data storage system | |
US10872015B2 (en) | Data storage system with strategic contention avoidance | |
US20210279188A1 (en) | Client input/output (i/o) access rate variation compensation | |
US11301376B2 (en) | Data storage device with wear range optimization | |
US11256621B2 (en) | Dual controller cache optimization in a deterministic data storage system | |
US12086455B2 (en) | Data storage system with workload-based asymmetry compensation | |
US12019898B2 (en) | Data storage system with workload-based dynamic power consumption | |
US11810625B2 (en) | Solid-state memory with intelligent cell calibration | |
CN114816828A (en) | Automatic tuning of firmware parameters for memory systems |
Legal Events
Date | Code | Title | Description
---|---|---|---
| FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
| AS | Assignment | Owner name: SEAGATE TECHNOLOGY LLC, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SECATCH, STACEY;CLAUDE, DAVID W.;REEL/FRAME:046612/0175; Effective date: 20180801
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
| STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
| STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER
| STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
| STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED
| STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED
| STCF | Information on status: patent grant | Free format text: PATENTED CASE
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY; Year of fee payment: 4