Re: [WIP] Effective storage of duplicates in B-tree index. - Mailing list pgsql-hackers
From: Alexandr Popov
Subject: Re: [WIP] Effective storage of duplicates in B-tree index.
Date:
Msg-id: [email protected]
In response to: Re: [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <[email protected]>)
List: pgsql-hackers
On 18.03.2016 20:19, Anastasia Lubennikova wrote:
> Please, find the new version of the patch attached. Now it has WAL
> functionality.
>
> A detailed description of the feature can be found in the README draft:
> https://goo.gl/50O8Q0
>
> This patch is pretty complicated, so I ask everyone who is interested
> in this feature to help with reviewing and testing it. I will be
> grateful for any feedback. But please, don't complain about code
> style; it is still work in progress.
>
> Next things I'm going to do:
> 1. More debugging and testing. I'm going to attach a couple of SQL
>    scripts for testing in the next message.
> 2. Fix NULLs processing.
> 3. Add a flag to pg_index that allows enabling/disabling compression
>    for each particular index.
> 4. Recheck locking considerations. I tried to write the code as
>    non-invasively as possible, but we need to make sure that the
>    algorithm is still correct.
> 5. Change BTMaxItemSize.
> 6. Bring back microvacuum functionality.

Hi, hackers.

This is my first review, so please don't be too strict with me.

I have tested this patch on the following table:

    create table message
    (
        id      serial,
        usr_id  integer,
        text    text
    );
    CREATE INDEX message_usr_id ON message (usr_id);

The table has 10,000,000 records.

I found the following: the fewer unique keys there are, the smaller the index is.

The next two tables demonstrate it.
New B-tree
Count of unique keys (usr_id) | index's size | time of creation
                     10000000 |       214 MB | 00:00:34.193441
                      3333333 |       214 MB | 00:00:45.731173
                      2000000 |       129 MB | 00:00:41.445876
                      1000000 |       129 MB | 00:00:38.455616
                       100000 |        86 MB | 00:00:40.887626
                        10000 |        79 MB | 00:00:47.199774

Old B-tree
Count of unique keys (usr_id) | index's size | time of creation
                     10000000 |       214 MB | 00:00:35.043677
                      3333333 |       286 MB | 00:00:40.922845
                      2000000 |       300 MB | 00:00:46.454846
                      1000000 |       278 MB | 00:00:42.323525
                       100000 |       287 MB | 00:00:47.438132
                        10000 |       280 MB | 00:01:00.307873

I inserted data both randomly and sequentially; it did not influence the index's size.
The time of selecting, inserting, and updating random rows is unchanged. That is great, but it certainly needs some more detailed study.

Alexander Popov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
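For anyone who wants to reproduce measurements like these, here is a sketch of the kind of session that could produce one row of the tables above. The modulus in the INSERT (here 100000) is an assumption standing in for "count of unique keys"; the size and timing functions (`pg_relation_size`, `pg_size_pretty`, psql's `\timing`) are standard PostgreSQL, but the exact data-generation method the original test used is not shown in the post.

```sql
-- Hypothetical reproduction sketch; run under psql with \timing on.
-- Populate ~10M rows with a chosen number of distinct usr_id values,
-- then time the index build and report its on-disk size.
TRUNCATE message;

INSERT INTO message (usr_id, text)
SELECT i % 100000,            -- modulus = number of distinct keys
       md5(i::text)           -- filler payload for the text column
FROM generate_series(1, 10000000) AS s(i);

CREATE INDEX message_usr_id ON message (usr_id);  -- \timing shows build time

SELECT pg_size_pretty(pg_relation_size('message_usr_id'));
```

Dropping and rebuilding the index between runs (rather than reusing it) keeps the size comparison between the patched and unpatched B-tree fair.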