US20140188821A1 - Method and System to Avoid Space Bloating During Run-Time Compression - Google Patents

Method and System to Avoid Space Bloating During Run-Time Compression Download PDF

Info

Publication number
US20140188821A1
US20140188821A1 US13/730,207 US201213730207A US2014188821A1 US 20140188821 A1 US20140188821 A1 US 20140188821A1 US 201213730207 A US201213730207 A US 201213730207A US 2014188821 A1 US2014188821 A1 US 2014188821A1
Authority
US
United States
Prior art keywords
data
database system
data page
indication
during
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US13/730,207
Other versions
US8965857B2 (en
Inventor
Panfeng ZHOU
Katsunori Terada
Yanhong Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sybase Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US13/730,207 priority Critical patent/US8965857B2/en
Assigned to SAP AG reassignment SAP AG ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: TERADA, KATSUNORI, WANG, YANHONG, ZHOU, PANFENG
Assigned to SYBASE, INC. reassignment SYBASE, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: SAP AG
Publication of US20140188821A1 publication Critical patent/US20140188821A1/en
Application granted granted Critical
Publication of US8965857B2 publication Critical patent/US8965857B2/en
Active legal-status Critical Current
Adjusted expiration legal-status Critical

Links

Images

Classifications

    • G06F17/30153
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures

Definitions

  • the present invention relates generally to databases and, more specifically, to improvements in efficiency for space utilization in a database.
  • Data compression and reorganization may allow for better use of limited database space available.
  • data compression and reorganization generally requires locking access to large segments of a database (e.g., table lock) for relatively long periods of time (e.g., by use of the reorg+rebuild SQL command combination).
  • An exemplary embodiment includes systems, methods, and computer program products for managing a database system, the method including locking during a database system idle time access by the database system to a data page of a data allocation unit, compressing during the database system idle time a data stored in the locked data page, and recording during the database system idle time an indication that the compressed and locked data page includes free storage space, wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page.
  • the data page may be compressed during idle time and the space freed therein may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild SQL command combination).
  • FIG. 1 illustrates a database management system, in accordance with an exemplary embodiment of the present invention.
  • FIG. 2 is a flowchart illustrating steps in accordance with an exemplary embodiment of the present invention.
  • FIG. 3 is a flowchart illustrating steps in accordance with another exemplary embodiment of the present invention.
  • FIG. 4 is a diagram illustrating the relationship of exemplary space management components, in accordance with an exemplary embodiment of the present invention.
  • FIG. 5 depicts an exemplary computer system in which embodiments of the present invention may be implemented.
  • modules in this specification and the claims means any combination of hardware or software components for performing the indicated function.
  • a module need not be a rigidly defined entity, such that several modules may overlap hardware and software components in functionality.
  • a software module may refer to a single line of code within a procedure, the procedure itself being a separate software module.
  • One skilled in the relevant arts will understand that the functionality of modules may be defined in accordance with a number of stylistic or performance-optimizing techniques, for example.
  • FIG. 1 illustrates a database management system (“DBMS”) 100 , in accordance with an exemplary embodiment of the present invention.
  • DBMS 100 comprises a database server component 102 configured to run database server software.
  • this database server software comprises the Adaptive Server Enterprise (“ASE”) Relational Database Management Server (“RDBMS”) software for high performance on-line transaction processing (“OLTP”) applications developed by Sybase, Inc.
  • DBMS 100 further comprises storage 104 , on which database 106 is stored.
  • Database 106 comprises data that can be served by DBMS 100 to one or more clients of the system.
  • Database 106 includes a table, such as table 108 , for storing specific user data.
  • Table 108 includes a plurality of data pages, such as data page 110 .
  • a reorg+rebuild command combination in combination with compression, may yield a desirable compression ratio (allocated storage space/compressed stored data).
  • these commands may be used to reconfigure a table's index, data, or both, to reduce or eliminate fragmentation which, combined with compression, yield an optimal amount of free space.
  • the reorg+rebuild command combination may be too computationally expensive for periodic or frequent use.
  • a page identifier corresponding to the data page is stored at a predetermined data structure (not shown) to indicate that data compression may be performed in the data page during a database system idle time.
  • the data page instead of the entire database 106 or the entire table 108
  • the page identifier corresponding to the data page is stored such that the space freed by the compression is available for storage of user data during a subsequent database system run time.
  • FIG. 2 is a flowchart 200 illustrating a method for freeing storage space in a data page during system idle time, according to an exemplary embodiment.
  • one or more computing devices of a database system determine if the database server 102 is in idle mode. If the database server 102 is in idle mode, then the method proceeds to step 210 where the one or more computing devices lock a data page, such as data page 102 illustrated in FIG. 1 . Locking a data page relates to restricting user access through database server 110 , or through any other device logically coupled to the database system, for reading data from and/or writing data to the particular data page.
  • the one or more computing devices may retain access to the particular data page for operations other than user access such as, for example, data compression.
  • the one or more computing devices compress some or all of the data stored in data page 110 .
  • data compression may be performed using one or more generally known data compression algorithms without departing from the scope of the present invention.
  • the one or more computing devices record an indication that the compressed data page includes free storage space.
  • a data page may be identified during the database system run time for compression during database system idle time, locked during the database system idle time to prevent reading from and/or writing into the data page by a user during a compression operation (e.g., by a user application), compressed, and identified as including freed storage space that may be used to store additional data during a future database system run time.
  • a compression operation e.g., by a user application
  • freed storage space that may be used to store additional data during a future database system run time.
  • FIG. 3 is a flowchart 300 illustrating a method according to another exemplary embodiment of the present invention.
  • one or more computing devices of a database system determine if a current data page (i.e., a data page currently used by the one or more computing devices to store user data), such as data page 102 illustrated in FIG. 1 , is fall and may be set for compression during a subsequent database system idle mode (e.g., as part of an idle mode background housekeeping process). If the current data page is full, at step 310 the one or more computing devices store an indication that the current data page is full.
  • a current data page i.e., a data page currently used by the one or more computing devices to store user data
  • a subsequent database system idle mode e.g., as part of an idle mode background housekeeping process
  • the indication may be in the form of an identifier associated with the current data page or the position of a field in a data structure, but the present invention is not so limited, and a person having ordinary skill in the art will recognize that there may be various ways to store this information without departing from the scope of the present teachings.
  • the one or more computing devices determine if the database server 102 is in idle mode. If the database server 102 is in idle mode, in step 320 the one or more computing devices lock the data page based on the indication that the data page is full as set forth in step 310 . At step 325 , the one or more computing devices compress some or all of the data stored in data page 110 . At step 330 , the one or more computing devices record an indication that the compressed data page includes free storage space and clear the indication that the data page is full.
  • the indication may be in the form of an identifier associated with the compressed data page or the position of a field in a data structure, but the present invention is not so limited, and a person having ordinary skill in the art will recognize that there may be various ways for store this information without departing from the scope of the present teachings.
  • the one or more computing devices determine if the database server 102 is in run time mode (i.e., the database server can read and/or write data to the data pages of the database system). If the database server is in run time mode and the database server needs to store data in a new data page (i.e., a current data page is full), in step 340 the one or more computing devices store new user data in the recently-compressed data page based on the indication that the data page 102 includes free space.
  • a data page that becomes full may be identified during the database system run time, locked during the database system idle time to prevent reading from and/or writing into the data page by a user application during a compression operation, compressed, and identified as including freed storage space that may be used to store additional data during a subsequent database system run time.
  • the database server determines if a previously used and compressed data page includes free space. If a previously used and compressed data page includes free space, the new user data is stored in the free space. If, on the other hand, no previously used data page includes free space (e.g., no compression has taken place or spaced freed by a previous compression has been already used) then a new data page is allocated to store the new user data.
  • a full data page may be compressed during idle time and the space freed by the compression may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild command combination).
  • data pages other than a data page being compressed within the corresponding table may be accessible to the database server 102 , as the table is not locked for compression and reorganization (as is the case in the reorg rebuild command combination).
  • FIG. 4 is a diagram 400 illustrating the relationship of some of the aforementioned exemplary memory management components, in accordance with an exemplary embodiment of the present invention.
  • a database 402 has a hash table 404 across a plurality of partitions 406 .
  • each partition includes a partition descriptor (PDES) 408 for describing the contents of its corresponding partition and one or more allocation units 410 , which are shown in FIG. 4 as having an exemplary 256 pages each.
  • PDES partition descriptor
  • Each allocation unit 410 shown on the right portion of FIG. 5 , has an allocation page 412 used to indicate which pages 414 of the allocation unit are free or used.
  • the allocation unit has a plurality of data pages 414 over a contiguous storage space.
  • an array of page identifiers (PID_CMP_ARRAY) is included in the PDES of each partition.
  • PID_CMP_ARRAY As illustrated in FIG. 2 and FIG. 3 , during a database system idle time, pages are compressed and a corresponding indication is stored. The corresponding indication may be stored in PID_CMP_ARRAY as, for example, a page identifier or a bit in a position of the array corresponding to the page identifier for the compressed page, without departing from the scope of the present invention.
  • the database server determines if a previously used and compressed data page includes free space by reading PID_CMP_ARRAY.
  • PID_CMP_ARRAY indicates that a previously used and compressed data page includes free space
  • a data page identified in PID_CMP_ARRAY becomes the current data page
  • the data page identifier is cleared from PID_CMP_ARRAY
  • the new user data is stored in the free space of the new current data page (i.e., data page identified in PID_CMP_ARRAY).
  • no previously used data page includes free space (i.e., PID_CMP_ARRAY is empty) then a new data page is allocated to store the new user data.
  • FIG. 5 illustrates an exemplary computer system 500 in which exemplary embodiments of the present invention, or portions thereof, can be implemented as computer-readable code.
  • FIGS. 2 and 300 of FIG. 3 can be implemented in system 500 .
  • FIG. 5 illustrates an exemplary computer system 500 in which exemplary embodiments of the present invention, or portions thereof, can be implemented as computer-readable code.
  • the methods illustrated by flowcharts 200 of FIGS. 2 and 300 of FIG. 3 can be implemented in system 500 .
  • Various embodiments of the invention are described in terms of this example computer system 500 . After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.
  • Computer system 500 includes one or more processors, such as processor 504 .
  • Processor 504 can be a special purpose or a general purpose processor.
  • Processor 504 is connected to a communication infrastructure 506 (for example, a bus or network).
  • Computer system 500 also includes a main memory 508 , preferably random access memory (RAM), and may also include a secondary memory 510 .
  • Secondary memory 510 may include, for example, a hard disk drive 512 , a removable storage drive 514 , and/or a memory stick.
  • Removable storage drive 514 may comprise a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like.
  • the removable storage drive 514 reads from and/or writes to a removable storage unit 518 in a well-known manner.
  • Removable storage unit 518 may comprise a floppy disk, magnetic tape, optical disk, etc. that is read by and written to by removable storage drive 514 .
  • removable storage unit 518 includes a computer usable storage medium having stored therein computer software and/or data.
  • secondary memory 510 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 500 .
  • Such means may include, for example, a removable storage unit 522 and an interface 520 .
  • Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 522 and interfaces 520 that allow software and data to be transferred from the removable storage unit 522 to computer system 500 .
  • Computer system 500 may also include a communications interface 524 .
  • Communications interface 524 allows software and data to he transferred between computer system 500 and external devices.
  • Communications interface 524 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like.
  • Software and data transferred via communications interface 524 are in the form of signals that may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 524 . These signals are provided to communications interface 524 via a communications path 526 .
  • Communications path 526 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link or other communications channels.
  • computer program medium and “computer usable medium” are used to generally refer to media such as removable storage unit 518 , removable storage unit 522 , and a hard disk installed in hard disk drive 512 . Signals carried over communications path 526 can also embody the logic described herein. Computer program medium and computer usable medium can also refer to memories, such as main memory 508 and secondary memory 510 , which can be memory semiconductors (e.g. DRAMs, etc.). These computer program products are means for providing software to computer system 500 .
  • Computer programs are stored in main memory 508 and/or secondary memory 510 . Computer programs may also be received via communications interface 524 . Such computer programs, when executed, enable computer system 500 to implement the present invention as discussed herein. In particular, the computer programs, when executed, enable processor 504 to implement the processes of the present invention, such as the steps in the methods illustrated by flowcharts 200 of FIG. 2 , 300 of FIG. 3 , and 400 of FIG. 4 , discussed above. Accordingly, such computer programs represent controllers of the computer system 500 . Where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 500 using removable storage drive 514 , interface 520 , hard drive 512 or communications interface 524 .
  • the invention is also directed to computer program products comprising software stored on any computer useable medium.
  • Such software when executed in one or more data processing device, causes a data processing device(s) to operate as described herein.
  • Embodiments of the invention employ any computer useable or readable medium, known now or in the future.
  • Examples of computer useable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD ROMS, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage device, etc.), and communication mediums (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Methods, systems, and computer program products are provided to manage a database system. The method includes locking during a database system idle time access by the database system to a data page of a data allocation unit, compressing during the database system idle time a data stored in the locked data page, and recording during the database system idle time an indication that the compressed and locked data page includes free storage space, wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page. Thus, the data page may be compressed during idle time and the space freed therein may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild SQL command combination).

Description

    BACKGROUND Field
  • The present invention relates generally to databases and, more specifically, to improvements in efficiency for space utilization in a database.
  • BACKGROUND
  • An important component of conventional database systems is their data compression capabilities. Data compression and reorganization may allow for better use of limited database space available. However, data compression and reorganization generally requires locking access to large segments of a database (e.g., table lock) for relatively long periods of time (e.g., by use of the reorg+rebuild SQL command combination).
  • For certain applications, such as high performance on-line transaction processing (OLTP) locking access to the database for relatively long periods of time (e.g., by use of reorg+rebuild command combination) is impractical, as fast and continuous access to the database is required. However, even for such systems, data compression and reorganization remains necessary to reduce database space, and thus, system cost.
  • SUMMARY
  • An exemplary embodiment includes systems, methods, and computer program products for managing a database system, the method including locking during a database system idle time access by the database system to a data page of a data allocation unit, compressing during the database system idle time a data stored in the locked data page, and recording during the database system idle time an indication that the compressed and locked data page includes free storage space, wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page. The data page may be compressed during idle time and the space freed therein may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild SQL command combination).
  • Further features and advantages of the present invention, as well as the structure and operation of various exemplary embodiments of the present invention, are described in detail below with reference to the accompanying drawings. It is noted that the present invention is not limited to the specific exemplary embodiments described herein. Such exemplary embodiments are presented herein for illustrative purposes only. Additional embodiments will be apparent to persons skilled in the relevant art(s) based on the teachings disclosed herein.
  • BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES
  • The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate exemplary embodiments and, together with the description, further serve to explain the principles of the disclosure and to enable a person skilled in the relevant art to make and use the disclosed and contemplated embodiments.
  • FIG. 1 illustrates a database management system, in accordance with an exemplary embodiment of the present invention.
  • FIG. 2 is a flowchart illustrating steps in accordance with an exemplary embodiment of the present invention.
  • FIG. 3 is a flowchart illustrating steps in accordance with another exemplary embodiment of the present invention.
  • FIG. 4 is a diagram illustrating the relationship of exemplary space management components, in accordance with an exemplary embodiment of the present invention.
  • FIG. 5 depicts an exemplary computer system in which embodiments of the present invention may be implemented.
  • The disclosure will now be presented with reference to the accompanying drawings. In the drawings, generally, like reference numbers indicate identical or functionally similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
  • DETAILED DESCRIPTION
  • The following detailed description of exemplary embodiments refers to the accompanying drawings that illustrate certain aspects of the disclosure. Other embodiments are possible, and modifications can be made to the exemplary embodiments within the spirit and scope of the present teachings. Therefore, the detailed description is not meant to limit the inventive concepts.
  • It would be apparent to one of skill in the art that one or more aspects of the present invention, as described below, can be implemented in many different embodiments of software, hardware, firmware, and/or the entities illustrated in the figures. Any actual software code with the specialized control of hardware to implement an exemplary embodiment of the present invention is not limiting of the invention. Thus, the operational behavior of the exemplary embodiment will be described with the understanding that modifications and variations of these embodiments are possible, and within the spirit and scope of the present invention.
  • Reference to modules in this specification and the claims means any combination of hardware or software components for performing the indicated function. A module need not be a rigidly defined entity, such that several modules may overlap hardware and software components in functionality. For example, a software module may refer to a single line of code within a procedure, the procedure itself being a separate software module. One skilled in the relevant arts will understand that the functionality of modules may be defined in accordance with a number of stylistic or performance-optimizing techniques, for example.
  • FIG. 1 illustrates a database management system (“DBMS”) 100, in accordance with an exemplary embodiment of the present invention. DBMS 100 comprises a database server component 102 configured to run database server software. In a non-limiting exemplary embodiment, this database server software comprises the Adaptive Server Enterprise (“ASE”) Relational Database Management Server (“RDBMS”) software for high performance on-line transaction processing (“OLTP”) applications developed by Sybase, Inc. DBMS 100 further comprises storage 104, on which database 106 is stored. Database 106 comprises data that can be served by DBMS 100 to one or more clients of the system. Database 106 includes a table, such as table 108, for storing specific user data. Table 108 includes a plurality of data pages, such as data page 110.
  • As it is known to those having ordinary skill in the art, a reorg+rebuild command combination, in combination with compression, may yield a desirable compression ratio (allocated storage space/compressed stored data). In particular, depending on selected options, these commands may be used to reconfigure a table's index, data, or both, to reduce or eliminate fragmentation which, combined with compression, yield an optimal amount of free space. However, for high performance OLTP applications, the reorg+rebuild command combination may be too computationally expensive for periodic or frequent use.
  • As will be explained in further detail below, in various exemplary embodiments, when a data page of database 106 (such as data page 110) is full, a page identifier corresponding to the data page is stored at a predetermined data structure (not shown) to indicate that data compression may be performed in the data page during a database system idle time. During the database system idle time the data page (instead of the entire database 106 or the entire table 108) is locked and compressed, and the page identifier corresponding to the data page (or another form of identification corresponding to the data page, such as a corresponding location in an array of bits), is stored such that the space freed by the compression is available for storage of user data during a subsequent database system run time.
  • FIG. 2 is a flowchart 200 illustrating a method for freeing storage space in a data page during system idle time, according to an exemplary embodiment. At step 205, one or more computing devices of a database system, which may be embodied in database server 102 illustrated in FIG. 1 by way of non-limiting example, determine if the database server 102 is in idle mode. If the database server 102 is in idle mode, then the method proceeds to step 210 where the one or more computing devices lock a data page, such as data page 102 illustrated in FIG. 1. Locking a data page relates to restricting user access through database server 110, or through any other device logically coupled to the database system, for reading data from and/or writing data to the particular data page. The one or more computing devices may retain access to the particular data page for operations other than user access such as, for example, data compression.
  • At step 215, the one or more computing devices compress some or all of the data stored in data page 110. Those skilled in the art would understand that data compression may be performed using one or more generally known data compression algorithms without departing from the scope of the present invention. At step 220, the one or more computing devices record an indication that the compressed data page includes free storage space.
  • Accordingly, in various exemplary embodiments of the present invention a data page may be identified during the database system run time for compression during database system idle time, locked during the database system idle time to prevent reading from and/or writing into the data page by a user during a compression operation (e.g., by a user application), compressed, and identified as including freed storage space that may be used to store additional data during a future database system run time. Thus, the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild command combination) is obviated. Furthermore, during compression, data pages other than a data page being compressed within the corresponding table may be accessible to the database server 102, as the table is not locked for compression and reorganization (as is the case during execution of the reorg+rebuild command combination).
  • FIG. 3 is a flowchart 300 illustrating a method according to another exemplary embodiment of the present invention. At step 305, one or more computing devices of a database system, which may be embodied in database server 102 illustrated in FIG. 1, determine if a current data page (i.e., a data page currently used by the one or more computing devices to store user data), such as data page 102 illustrated in FIG. 1, is fall and may be set for compression during a subsequent database system idle mode (e.g., as part of an idle mode background housekeeping process). If the current data page is full, at step 310 the one or more computing devices store an indication that the current data page is full. The indication may be in the form of an identifier associated with the current data page or the position of a field in a data structure, but the present invention is not so limited, and a person having ordinary skill in the art will recognize that there may be various ways to store this information without departing from the scope of the present teachings.
  • At step 315, the one or more computing devices determine if the database server 102 is in idle mode. If the database server 102 is in idle mode, in step 320 the one or more computing devices lock the data page based on the indication that the data page is full as set forth in step 310. At step 325, the one or more computing devices compress some or all of the data stored in data page 110. At step 330, the one or more computing devices record an indication that the compressed data page includes free storage space and clear the indication that the data page is full. The indication may be in the form of an identifier associated with the compressed data page or the position of a field in a data structure, but the present invention is not so limited, and a person having ordinary skill in the art will recognize that there may be various ways for store this information without departing from the scope of the present teachings.
  • At step 335, the one or more computing devices determine if the database server 102 is in run time mode (i.e., the database server can read and/or write data to the data pages of the database system). If the database server is in run time mode and the database server needs to store data in a new data page (i.e., a current data page is full), in step 340 the one or more computing devices store new user data in the recently-compressed data page based on the indication that the data page 102 includes free space.
  • Accordingly, in various exemplary embodiments of the present invention, a data page that becomes full may be identified during the database system run time, locked during the database system idle time to prevent reading from and/or writing into the data page by a user application during a compression operation, compressed, and identified as including freed storage space that may be used to store additional data during a subsequent database system run time. During a database system run time, when a current page is full and more space is necessary to store new user data, the database server determines if a previously used and compressed data page includes free space. If a previously used and compressed data page includes free space, the new user data is stored in the free space. If, on the other hand, no previously used data page includes free space (e.g., no compression has taken place or spaced freed by a previous compression has been already used) then a new data page is allocated to store the new user data.
  • Furthermore, as noted above with respect to FIG. 2, a full data page may be compressed during idle time and the space freed by the compression may be used during a subsequent run time without the need for a reorganization of the data pages within the corresponding table (as in, for example, operation of a reorg+rebuild command combination). In addition, during compression, data pages other than a data page being compressed within the corresponding table may be accessible to the database server 102, as the table is not locked for compression and reorganization (as is the case in the reorg rebuild command combination).
  • FIG. 4 is a diagram 400 illustrating the relationship of some of the aforementioned exemplary memory management components, in accordance with an exemplary embodiment of the present invention. As shown on the left portion of FIG. 4, a database 402 has a hash table 404 across a plurality of partitions 406.
  • As shown in the center portion of FIG. 4, each partition includes a partition descriptor (PDES) 408 for describing the contents of its corresponding partition and one or more allocation units 410, which are shown in FIG. 4 as having an exemplary 256 pages each. Each allocation unit 410, shown on the right portion of FIG. 5, has an allocation page 412 used to indicate which pages 414 of the allocation unit are free or used. The allocation unit has a plurality of data pages 414 over a contiguous storage space.
  • In various exemplary embodiments of the present invention, an array of page identifiers (PID_CMP_ARRAY, is included in the PDES of each partition. As illustrated in FIG. 2 and FIG. 3, during a database system idle time, pages are compressed and a corresponding indication is stored. The corresponding indication may be stored in PID_CMP_ARRAY as, for example, a page identifier or a bit in a position of the array corresponding to the page identifier for the compressed page, without departing from the scope of the present invention. During the database system run time, when a current page is full and more space is necessary to store new user data, the database server determines if a previously used and compressed data page includes free space by reading PID_CMP_ARRAY. If PID_CMP_ARRAY indicates that a previously used and compressed data page includes free space, a data page identified in PID_CMP_ARRAY becomes the current data page, the data page identifier is cleared from PID_CMP_ARRAY, and the new user data is stored in the free space of the new current data page (i.e., data page identified in PID_CMP_ARRAY). If, on the other hand, no previously used data page includes free space (i.e., PID_CMP_ARRAY is empty) then a new data page is allocated to store the new user data.
  • Example Computer System Implementation
  • Various aspects of the present invention can be implemented by software, firmware, hardware, or a combination thereof. FIG. 5 illustrates an exemplary computer system 500 in which exemplary embodiments of the present invention, or portions thereof, can be implemented as computer-readable code. For example, the methods illustrated by flowcharts 200 of FIGS. 2 and 300 of FIG. 3, can be implemented in system 500. Various embodiments of the invention are described in terms of this example computer system 500. After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.
  • Computer system 500 includes one or more processors, such as processor 504. Processor 504 can be a special purpose or a general purpose processor. Processor 504 is connected to a communication infrastructure 506 (for example, a bus or network).
  • Computer system 500 also includes a main memory 508, preferably random access memory (RAM), and may also include a secondary memory 510. Secondary memory 510 may include, for example, a hard disk drive 512, a removable storage drive 514, and/or a memory stick. Removable storage drive 514 may comprise a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, or the like. The removable storage drive 514 reads from and/or writes to a removable storage unit 518 in a well-known manner. Removable storage unit 518 may comprise a floppy disk, magnetic tape, optical disk, etc. that is read by and written to by removable storage drive 514. As will be appreciated by persons skilled in the relevant art(s), removable storage unit 518 includes a computer usable storage medium having stored therein computer software and/or data.
  • In alternative implementations, secondary memory 510 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 500. Such means may include, for example, a removable storage unit 522 and an interface 520. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 522 and interfaces 520 that allow software and data to be transferred from the removable storage unit 522 to computer system 500.
  • Computer system 500 may also include a communications interface 524. Communications interface 524 allows software and data to he transferred between computer system 500 and external devices. Communications interface 524 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, or the like. Software and data transferred via communications interface 524 are in the form of signals that may be electronic, electromagnetic, optical, or other signals capable of being received by communications interface 524. These signals are provided to communications interface 524 via a communications path 526. Communications path 526 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link or other communications channels.
  • In this document, the terms “computer program medium” and “computer usable medium” are used to generally refer to media such as removable storage unit 518, removable storage unit 522, and a hard disk installed in hard disk drive 512. Signals carried over communications path 526 can also embody the logic described herein. Computer program medium and computer usable medium can also refer to memories, such as main memory 508 and secondary memory 510, which can be memory semiconductors (e.g. DRAMs, etc.). These computer program products are means for providing software to computer system 500.
  • Computer programs (also called computer control logic) are stored in main memory 508 and/or secondary memory 510. Computer programs may also be received via communications interface 524. Such computer programs, when executed, enable computer system 500 to implement the present invention as discussed herein. In particular, the computer programs, when executed, enable processor 504 to implement the processes of the present invention, such as the steps in the methods illustrated by flowcharts 200 of FIG. 2, 300 of FIG. 3, and 400 of FIG. 4, discussed above. Accordingly, such computer programs represent controllers of the computer system 500. Where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 500 using removable storage drive 514, interface 520, hard drive 512 or communications interface 524.
  • The invention is also directed to computer program products comprising software stored on any computer useable medium. Such software, when executed in one or more data processing device, causes a data processing device(s) to operate as described herein. Embodiments of the invention employ any computer useable or readable medium, known now or in the future. Examples of computer useable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory), secondary storage devices (e.g., hard drives, floppy disks, CD ROMS, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage device, etc.), and communication mediums (e.g., wired and wireless communications networks, local area networks, wide area networks, intranets, etc.).
  • CONCLUSION
  • It is to be appreciated that the Detailed Description section, and not the Summary and Abstract sections, is intended to be used to interpret the claims. The Summary and Abstract sections may set forth one or more but not all exemplary embodiments of the present invention as contemplated by the inventor(s), and thus, are not intended to limit the present invention and the appended claims in any way.
  • The present invention has been described above with the aid of functional building blocks illustrating the implementation of specified functions and relationships thereof The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed.
  • The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.
  • The breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (18)

What is claimed is:
1. A method comprising:
locking, by one or more computing devices during an idle time of a database system, access by the database system to a data page;
compressing, by the one or more computing devices during the idle time of the database system, data stored in the locked data page; and
recording, by the one or more computing devices during the idle time of the database system, an indication that the locked and compressed data page includes free storage space,
wherein the data page is one of a plurality of data pages in a data allocation unit of the database system, and
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page.
2. The method of claim 1, further comprising:
unlocking, by the one or more computing devices, access to the locked and compressed data page; and
storing, by the one or more computing devices during a run time of the database system, a second data in the compressed data page based on the indication that the compressed data page includes free storage space.
3. The method of claim 2, further comprising:
recording, by the one or more computing devices during the run time of the database system, an indication that the current data page for storing data is the compressed data page; and
deleting, by the one or more computing devices during the run time of the database system, the indication that the compressed data page includes free storage space.
4. The method of claim 2, further comprising:
recording, by the one or more computing devices during the run time of the database system, an indication that a second data page of the data allocation unit is full;
locking, by the one or more computing devices during a second idle time of a database system, access by the database system to the second data page; and
compressing, by the one or more computing devices during the second idle time of the database system, data stored in the locked second data page,
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked second data page.
5. The method of claim 1, wherein
the database system comprises a memory partition for storing data pages,
the memory partition includes a partition descriptor data structure for storing information about stored data pages, and
the recording of the indication that the compressed and locked data page includes free storage space comprises recording the indication in a partition description data structure corresponding to the partition of the locked and compressed data page.
6. A method comprising:
storing, by one or more computing devices during a run time of a database system, data in a data page that includes a compressed data based on an indication that the data page includes space freed by a compression of data.
7. The method of claim 6, further comprising:
allocating, by the one or more computing devices during the run time of the database system, a second data page for storing data based on an indication that no previously-allocated and previously-compressed data page includes space freed by a compression of data.
8. The method of claim 7, further comprising:
recording, by the one or more computing devices during the run time of the database system, an indication that the second data page is full;
compressing, by the one or more computing devices during an idle time of the database system subsequent to the run time of the database system, data stored in the second data page; and
recording, by the one or more computing devices, an indication that the second data page includes space freed by a compression of data.
9. A database system comprising:
a memory; and
at least one processor coupled to the memory and configured to:
lock, during an idle time of the database system, access to a data page;
compress, during the idle time of the database system, data stored in the locked data page; and
record, during the idle time of the database system, an indication that the locked and compressed data page includes free storage space,
wherein the data page is one of a plurality of data pages in a data allocation unit of the database system, and
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page.
10. The database system of claim 9, the at least one processor further configured to:
unlock access to the locked and compressed data page; and
store, during a run time of the database system, a second data in the compressed data page based on the indication that the compressed data page includes free storage space.
11. The database system of claim 10, the at least one processor further configured to:
record, during the run time of the database system, an indication that the current data page for storing data is the compressed data page; and
delete, during the run time of the database system, the indication that the compressed data page includes free storage space.
12. The database system of claim 10, the at least one processor further configured to:
record, during the run time of the database system, an indication that a second data page of the data allocation unit is full;
lock, during a second idle time of a database system, access by the database system to the second data page; and
compress, during the second idle time of the database system, data stored in the locked second data page,
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked second data page.
13. The database system of claim 9, the at least one processor further configured to:
store data pages according to a memory partition, wherein the memory partition includes a partition descriptor data structure for storing information about stored data pages; and
record the indication that the compressed and locked data page includes free storage by recording the indication in a partition description data structure corresponding to the partition of the locked and compressed data page.
14. A computer-readable storage device having stored thereon instructions, execution of which, by a computing device, cause the computing device to perform operations comprising:
locking, by one or more computing devices during an idle time of a database system, access by the database system to a data page;
compressing, by the one or more computing devices during the idle time of the database system, data stored in the locked data page; and
recording, by the one or more computing devices during the idle time of the database system, an indication that the locked and compressed data page includes free storage space,
wherein the data page is one of a plurality of data pages in a data allocation unit of the database system, and
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked data page.
15. The computer-readable storage device of claim 14, the operations further comprising:
unlocking, by the one or more computing devices, access to the locked and compressed data page; and
storing, by the one or more computing devices during a run time of the database system, a second data in the compressed data page based on the indication that the compressed data page includes free storage space.
16. The computer-readable storage device of claim 15, the operations further comprising:
recording, by the one or more computing devices during the run time of the database system, an indication that the current data page for storing data is the compressed data page; and
deleting, by the one or more computing devices during the run time of the database system, the indication that the compressed data page includes free storage space.
17. The computer-readable storage device of claim 15, the operations further comprising:
recording, by the one or more computing devices during the run time of the database system, an indication that a second data page of the data allocation unit is full;
locking, by the one or more computing devices during a second idle time of a database system, access by the database system to the second data page; and
compressing, by the one or more computing devices during the second idle time of the database system, data stored in the locked second data page,
wherein unlocked data pages of the data allocation unit are accessible by the database system during the compressing of the data stored in the locked second data page.
18. The computer-readable storage device of claim 14, wherein
the database system comprises a memory partition for storing data pages,
the memory partition includes a partition descriptor data structure for storing information about stored data pages, and
the recording of the indication that the compressed and locked data page includes free storage space comprises recording the indication in a partition description data structure corresponding to the partition of the locked and compressed data page.
US13/730,207 2012-12-28 2012-12-28 Method and system to avoid space bloating during run-time compression Active 2033-02-21 US8965857B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/730,207 US8965857B2 (en) 2012-12-28 2012-12-28 Method and system to avoid space bloating during run-time compression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/730,207 US8965857B2 (en) 2012-12-28 2012-12-28 Method and system to avoid space bloating during run-time compression

Publications (2)

Publication Number Publication Date
US20140188821A1 true US20140188821A1 (en) 2014-07-03
US8965857B2 US8965857B2 (en) 2015-02-24

Family

ID=51018380

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/730,207 Active 2033-02-21 US8965857B2 (en) 2012-12-28 2012-12-28 Method and system to avoid space bloating during run-time compression

Country Status (1)

Country Link
US (1) US8965857B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181938A (en) * 2019-07-05 2021-01-05 杭州海康威视数字技术股份有限公司 Database cleaning method, device and computer readable storage medium
CN114461405A (en) * 2022-04-01 2022-05-10 荣耀终端有限公司 Storage method and related device for locking page in memory

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020073287A1 (en) * 2000-12-12 2002-06-13 International Business Machines Corporation Method and apparatus for implementing locking of non-data page operations
US6985950B1 (en) * 2001-03-06 2006-01-10 Microsoft Corporation System for creating a space-efficient document categorizer for training and testing of automatic categorization engines

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020073287A1 (en) * 2000-12-12 2002-06-13 International Business Machines Corporation Method and apparatus for implementing locking of non-data page operations
US6985950B1 (en) * 2001-03-06 2006-01-10 Microsoft Corporation System for creating a space-efficient document categorizer for training and testing of automatic categorization engines

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112181938A (en) * 2019-07-05 2021-01-05 杭州海康威视数字技术股份有限公司 Database cleaning method, device and computer readable storage medium
CN114461405A (en) * 2022-04-01 2022-05-10 荣耀终端有限公司 Storage method and related device for locking page in memory

Also Published As

Publication number Publication date
US8965857B2 (en) 2015-02-24

Similar Documents

Publication Publication Date Title
US9659050B2 (en) Delta store giving row-level versioning semantics to a non-row-level versioning underlying store
US11048757B2 (en) Cuckoo tree with duplicate key support
US10839016B2 (en) Storing metadata in a cuckoo tree
US7933938B2 (en) File storage system, file storing method and file searching method therein
US20160147806A1 (en) Versioned bloom filter
WO2017185579A1 (en) Method and apparatus for data storage
JP2002530776A (en) Apparatus and method for concurrent DBMS table manipulation
CN108139972B (en) Method and apparatus for managing memory fragmentation in hardware assisted data compression
WO2017050064A1 (en) Memory management method and device for shared memory database
US20190384743A1 (en) Method, device and computer readable storage medium for deleting snapshot data
CN114936188A (en) Data processing method and device, electronic equipment and storage medium
US20160292257A1 (en) Providing multiple concurrent transactions on a single database schema using a single concurrent transaction database infrastructure
CN107092624A (en) Date storage method, apparatus and system
US8595430B2 (en) Managing a virtual tape library domain and providing ownership of scratch erased volumes to VTL nodes
CN109144403B (en) Method and equipment for switching cloud disk modes
US8965857B2 (en) Method and system to avoid space bloating during run-time compression
US10311026B2 (en) Compressed data layout for optimizing data transactions
US20130080481A1 (en) Extreme large space allocation
US11055184B2 (en) In-place garbage collection of a sharded, replicated distributed state machine based on supersedable operations
CN114297196A (en) Metadata storage method and device, electronic equipment and storage medium
US10877881B2 (en) In-place garbage collection of a sharded, replicated distributed state machine based on mergeable operations
EP3436990B1 (en) Systems and methods for enabling modifications of multiple data objects within a file system volume
EP4321981A1 (en) Data processing method and apparatus
US20190171635A1 (en) System and Method for Creating Storage Containers in a Data Storage System
US8819355B2 (en) Information processing apparatus, information processing method, and program

Legal Events

Date Code Title Description
AS Assignment

Owner name: SAP AG, GERMANY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHOU, PANFENG;TERADA, KATSUNORI;WANG, YANHONG;SIGNING DATES FROM 20130412 TO 20130514;REEL/FRAME:030502/0530

AS Assignment

Owner name: SYBASE, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SAP AG;REEL/FRAME:031172/0989

Effective date: 20130905

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8