<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Wojno: Theoretical Sets</title>
    <link>http://christopher.wojno.com/articles/2008/05/31/theoretical-sets</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Exploration through Code</description>
    <item>
      <title>Theoretical Sets</title>
      <description>&lt;p&gt;Recently, I have been thinking about a particular data structure that I&amp;#8217;ve never seen before. For those who do not know, I am an avid C/C++ programmer and have recently discovered Ruby and Rails. Thus, I am familiar with the &lt;span class="caps"&gt;STL&lt;/span&gt; that is so closely related to C++. One of my favorite containers is the set. A set operates just as you&amp;#8217;d expect of a mathematical set (thus the name): it contains a &lt;em&gt;set&lt;/em&gt; of distinct values.&lt;/p&gt;


	&lt;p&gt;For example (and assume I have a magical print function to inspect the sets on the terminal): &lt;div id="figure1"&gt;&lt;pre class="code"&gt;set&amp;lt;int&amp;gt; my_set;
my_set.insert( 7 );
my_set.insert( 5 );
magic_set_printer( my_set );&lt;/pre&gt;&lt;a name="figure1"&gt;&lt;/a&gt;Figure 1: A C++ &lt;span class="caps"&gt;STL&lt;/span&gt; set with two values.&lt;/div&gt;This will produce: &lt;pre class="code"&gt;{5,7}&lt;/pre&gt; The very neat part about sets is that one can apply set operations across two or more sets. My favorite is set_difference, but set_union and set_intersection are also highly useful.&lt;/p&gt;


	&lt;p&gt;As you can see, they&amp;#8217;re good for keeping small sets of things.&lt;/p&gt;


	&lt;h1&gt;So What is a Theoretical Set?&lt;/h1&gt;


	&lt;p&gt;The problem with sets as they appear in C++ is that they require storage space for every item stored in them. This isn&amp;#8217;t a big problem if you have only a few elements in a set, but what&amp;#8217;s the fun in that? Suppose you have the same set as in &lt;a href="#figure1"&gt;Figure 1&lt;/a&gt; but decide to add the value 6. Under a normal set, your set will contain the values 5, 6, and 7. Extending the argument, if your set contains a contiguous, uninterrupted run from, say, 5 to 1,000, your set will contain, yes, you guessed it: 996 elements. Clearly, there&amp;#8217;s room for some improvement if you expect contiguous values.&lt;/p&gt;


	&lt;p&gt;Enter the theoretical set.&lt;/p&gt;


	&lt;p&gt;It is just like a normal &lt;span class="caps"&gt;STL&lt;/span&gt; set, except that it takes shortcuts on memory by combining contiguous values into ranges. For example, the set {5,6,7} becomes {5..7}: the original set stores 3 values, while the theoretical set stores only the 2 endpoints. For a large contiguous set such as 5..1000, the theoretical set still stores just 2 values, so both memory and search time are saved. I&amp;#8217;m not sure exactly at what point theoretical sets become more economical (it depends on use), but if you expect contiguous ranges in your sets, I suspect performance will be much improved.&lt;/p&gt;
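
	&lt;p&gt;To make the merging concrete, here is a minimal sketch in Ruby, the language the rest of this post leans toward. It is not a finished implementation: the class name TheoreticalSet, its methods, and the choice to store each run as a [first, last] pair of integers are all assumptions made for illustration.&lt;/p&gt;

&lt;pre class="code"&gt;
```ruby
# Hypothetical sketch: integers stored as sorted, non-overlapping,
# non-adjacent [first, last] pairs instead of one entry per value.
class TheoreticalSet
  attr_reader :ranges

  def initialize
    @ranges = []
  end

  # Insert one value, merging it with every range it overlaps or touches.
  def insert(value)
    first = value
    last = value
    kept = []
    @ranges.each do |lo, hi|
      if [lo, first].max > [hi, last].min + 1
        # A gap of at least one value separates the two: keep as is.
        kept.push([lo, hi])
      else
        # Overlapping or adjacent: absorb this range into the new one.
        first = [first, lo].min
        last = [last, hi].max
      end
    end
    kept.push([first, last])
    @ranges = kept.sort
    self
  end

  def include?(value)
    @ranges.any? { |lo, hi| value.between?(lo, hi) }
  end

  # Number of values represented, not number of stored pairs.
  def size
    @ranges.sum { |lo, hi| hi - lo + 1 }
  end
end

set = TheoreticalSet.new
(5..1000).each { |n| set.insert(n) }
p set.ranges   # => [[5, 1000]]
p set.size     # => 996
```
&lt;/pre&gt;

	&lt;p&gt;Each insert here scans every stored range, which keeps the sketch short; a serious version would presumably keep the ranges in a sorted structure and search it instead.&lt;/p&gt;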


	&lt;p&gt;Unfortunately, operating on ranges is harder than operating on single values. Thus, there is a relatively significant performance penalty when calculating intersections, differences, etc. In the long run, though, the penalty is not noticeable.&lt;/p&gt;
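
	&lt;p&gt;To show where the extra bookkeeping comes from, the following Ruby sketch intersects two sets stored as lists of [first, last] pairs. The function name intersect_ranges and the pair representation are hypothetical, and the sketch assumes integer values and input lists that are already sorted and non-overlapping.&lt;/p&gt;

&lt;pre class="code"&gt;
```ruby
# Hypothetical helper: intersect two sorted lists of non-overlapping
# [first, last] integer pairs by walking both lists in step.
def intersect_ranges(a, b)
  result = []
  i = 0
  j = 0
  until i == a.length || j == b.length
    a_first, a_last = a[i]
    b_first, b_last = b[j]
    first = [a_first, b_first].max
    last = [a_last, b_last].min
    # The two ranges share values only when the overlap is non-empty.
    result.push([first, last]) if last >= first
    # Advance whichever range ends first; the other one may still
    # overlap the next range in the opposite list.
    if b_last >= a_last
      i += 1
    else
      j += 1
    end
  end
  result
end

left = [[1, 10], [20, 30]]
right = [[5, 25], [28, 40]]
p intersect_ranges(left, right)   # => [[5, 10], [20, 25], [28, 30]]
```
&lt;/pre&gt;

	&lt;p&gt;The walk visits each stored range once, so its cost grows with the number of ranges rather than with the number of values they cover.&lt;/p&gt;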


	&lt;h1&gt;A Ruby Theoretical Set&lt;/h1&gt;


	&lt;p&gt;I had originally envisioned such a construct to be written for the benefit of C++ coders. However, the constraints on number types dissuaded me. Ideally, one should be able to use a theoretical set on any type of number. While templates offer a partial solution, I also wanted more: complex types.&lt;/p&gt;


	&lt;p&gt;Should one not be able to arrange a set of contiguous dates? Maybe not contiguous to the second, but perhaps weekends? A set over a range of objects. Makes you tingle a bit, doesn&amp;#8217;t it?&lt;/p&gt;


	&lt;p&gt;The nice thing about dynamically typed languages is that you can pass just about anything into a container without worrying much about some of the details. That solves the typing problems with C++ and makes for a good set!&lt;/p&gt;
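
	&lt;p&gt;As a rough illustration of that flexibility, this Ruby sketch collapses a list of values into contiguous runs using nothing but sorting, equality, and the succ method. The function name collapse_runs is hypothetical, and the assumption is that every element responds to succ, as Integer and Date both do.&lt;/p&gt;

&lt;pre class="code"&gt;
```ruby
require 'date'

# Hypothetical sketch: merge values into [first, last] runs using only
# sorting and succ, so the same code serves Integers and Dates alike.
def collapse_runs(values)
  runs = []
  values.sort.uniq.each do |value|
    last_run = runs.last
    if last_run.nil? || last_run[1].succ != value
      # Not the successor of the current run: start a new run.
      runs.push([value, value])
    else
      # Directly follows the current run: extend its upper end.
      last_run[1] = value
    end
  end
  runs
end

p collapse_runs([7, 5, 6, 10])   # => [[5, 7], [10, 10]]

# Two weekends, each a contiguous pair of days.
weekend = [Date.new(2008, 5, 31), Date.new(2008, 6, 1)]
following = [Date.new(2008, 6, 7), Date.new(2008, 6, 8)]
collapse_runs(weekend + following).each do |first, last|
  puts "#{first}..#{last}"
end
# prints 2008-05-31..2008-06-01
# prints 2008-06-07..2008-06-08
```
&lt;/pre&gt;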


	&lt;p&gt;I have not actually implemented a straight-up Ruby Theoretical Set. I have created one that uses ActiveRecord, but that&amp;#8217;s a post for another time.&lt;/p&gt;</description>
      <pubDate>Sat, 31 May 2008 12:46:00 -0700</pubDate>
      <guid isPermaLink="false">urn:uuid:83a55941-eedf-4195-97bf-cfb628236210</guid>
      <author>Christopher Wojno</author>
      <link>http://christopher.wojno.com/articles/2008/05/31/theoretical-sets</link>
      <category>Theory</category>
      <category>theoretical</category>
      <category>set</category>
      <category>c</category>
      <category>stl</category>
    </item>
  </channel>
</rss>
