This blog has moved here.

Tuesday, April 24, 2012

A simple way to configure a DNS for your Oracle playground RAC

Because we are talking about a test/playground environment, the first idea which comes to mind is virtualization. VirtualBox, Oracle VM, VMWare and, of course, others, all may be used to create a RAC playground, provided that you have a computer powerful enough to support this kind of setup.

But, despite the advantage of virtualization, you also have to think of how the new virtualized RAC will talk to the external world and how you're going to support the new SCAN feature available in Oracle 11g. Yeap, you guessed right! I'm talking about the integration with your DNS server. I will not cover here the GNS feature but just the SCAN address, which must be defined in your DNS to three different IPs, in a round-robin fashion.

Hmmm, damn it! My virtualized RAC is on my home desktop, I have a stupid wireless rooter without any possibility to define DNS entries there, and installing a new machine just for a DNS server seems to be a awfully heavyweight solution. In addition, using BIND, the well know DNS server, is not very appealing for me because I'm not a network administrator and I simply don't get it. But, if you plan to stick with BIND you may find an example here. Of course, being a playground, you may rely to the /etc/hosts file but you'll receive warnings on install and it doesn't feel right, does it?

Ok, introducing MaraDNS. They claim that: "People like MaraDNS because it's small, lightweight, easy to set up, and remarkably secure. It's also cross platform -- the program runs both in Windows and in UNIX clones". Hey, this is what I'm looking for. Great! So, comming back to our environment.

In my case, VirtualBox is installed on a Windows 7 box and the RAC nodes are virtualized with OL6. The idea is that I will have MaraDNS installed on my Windows 7 and it will service my virtualized RAC nodes. The following picture should shed some light.

Now let's configure MaraDNS to act like we want. The Windows version doesn't have any installer. You'll simply extract the archive and you'll have everything you want. Then, we need to edit "mararc" config file. I ended up with:

ipv4_bind_addresses = "127.0.0.1"
timestamp_type = 2
hide_disclaimer = "YES"
random_seed_file = "secret.txt"
csv2 = {}
csv2["altik.ro."] = "db.lan.txt"

I basically say:
  1. listen on localhost for requests (strange, isn't it? Be patient!)
  2. set the timestamp_type, but don't bother
  3. hide the disclaimer when launching the server. I like it quiet
  4. in secret.txt file I have some random text to secure my server, but don't bother too much
  5. everything with a trailing "altik.ro" (the domain) is mapped in the "db.lan.txt" file. Of course, you have to pick your domain. It doesn't need to be a real one, but if it exists it becomes unreachable.
Okey, great! Now let me see the "db.lan.txt" file:

ping.altik.ro.            FQDN4 192.168.1.100
pong.altik.ro.            FQDN4 192.168.1.101
ping-vip.altik.ro.        FQDN4 192.168.1.200
pong-vip.altik.ro.        FQDN4 192.168.1.201
poc-scan.altik.ro.        FQDN4 192.168.1.150
poc-scan.altik.ro.        FQDN4 192.168.1.151
poc-scan.altik.ro.        FQDN4 192.168.1.152

Mmm, that's all? Pretty much! Go ahead, start the server using the provided "bat" file and execute some nslookup commands. For example, "nslookup ping.altik.ro 127.0.0.1". Does it work? Test the SCAN as well. Every nslookup should return another IP (round robin algorithm).

Now, we have two problems:
  1. our DNS server is listening on localhost and there's no way to make requests from a remote node
  2. I'd also like to have internet on my RAC nodes in order to use YUM and other tasks which need Internet
No problem! In MaraDNS folder there is a sub-folder called "Deadwood...", which contain a so-called DNS recursive server. It doesn't allow to define your DNS entries but you can tell it to ask another DNS server for its entries and to cache them. Hey, that's exactly what we want. Ask the DNS server of my ISP for its entries and my local MaraDNS. Let's have a look at the config file (dwood3rc.txt):

root_servers = {}

# My MaraDNS
root_servers["192.in-addr.arpa."] = "127.0.0.1"
root_servers["altik.ro."] = "127.0.0.1"

# DNS server from my ISP
root_servers["."] = "213.154.124.1,"
root_servers["."] += "193.231.252.1"
# The IP this program has
bind_address="192.168.1.10"

# The IPs allowed to connect and use the cache
recursive_acl = "192.168.1.0/24"

# The file containing a hard-to-guess secret
random_seed_file = "secret.txt"

# This is the file Deadwood uses to read the cache to and from disk
cache_file = "dw_cache_bin"
# By default, for security reasons, Deadwood does not allow IPs in the
# 192.168.x.x, 172.[16-31].x.x, 10.x.x.x, 127.x.x.x, 169.254.x.x,
# 224.x.x.x, or 0.0.x.x range.  If using Deadwood to resolve names
# on an internal network, uncomment the following line:
filter_rfc1918 = 0

The above configuration is quite self explanatory. Everyting which refers to my local domain will be served from MaraDNS and everything else will be fetched from a real DNS server provided by my ISP. The "in-adr.arpa" entry is needed in order to support reverse DNS. Pay attention that this DNS server will listen on a real address, the one of my Windows 7 host.

Okay, great! Install Deadwood (as a service), and then test it with nslookup but instead of 127.0.0.1 use 192.168.1.10. If everything is working then it's just a matter of configuring your virtualized nodes to use 192.168.1.10 as a DNS server.

Friday, March 30, 2012

Escape HTML Entities Except for Some Well Known Tags

We are currently developing an APEX application and one of the requirement is to let users enter formatted text data into the system via a CKEDITOR control.


As it may be noticed above, not all the formatting options are enabled but just the possibility to apply bold, italic and underlined styling and, in addition, to create ordered and unordered lists. This means that the data we store into the database has html tags within and it must be echoed back as it is into the browser when the page which renders that data is about to be displayed.

Of course, the first thing you must take into consideration is the security issue: we don't want to render our pages with unsecure tags therefore we must sanitize the input. What we want is to remove all tags which are not into our whitelist. It seems like a very simple problem but the solution is not so obvious.

Our first attempt was to use a smart regular expression to do the job. It turned out that the expression was simply too complicated and very difficult to maintain. Put into the mix the fact that the Oracle regexp engine doesn't have all the goodies you'd expect if you are a Perl or Ruby developer and you quickly end up in disapointment.

The actual solution we implemented (which seems to work pretty well) is to parse the html content and to remove the offending tags. In order to parse we used just an ordinary xmltype. Let's see an example:
select regexp_replace( 
        deletexml(
          xmltype('<post>' ||  
                  'You have <b>been</b> <u>hacked</u> ' ||
                  'by <br/><a href="http://www.youtube.com/watch?v=CjPfBxJboJI">Gurin Remus Adrian</a>.' ||
                  '</post>'),
          '/post/*[(name() != "b") and (name() != "u")]'),
        '(^<post>)|(</post>$)',
        '') 
  from dual;

CLEAN_HTML                            
--------------------------------------
You have <b>been</b><u>hacked</u> by .
Great! Having that it's quite easy to write a PL/SQL function to embed the above logic:
  function strip_html_tags(html_in in clob, except_tags_in in varchar2 := null) return clob as
    l_clean_html clob;
    l_filter varchar2(32767) := '/post';
    l_predicates varchar2(32767);
    l_array apex_application_global.vc_arr2;
  begin
    if except_tags_in is null then
      l_filter := l_filter || '/*[(name() != "post")]';
    else
      l_array := apex_util.string_to_table(except_tags_in, ',');
      for i in 1..l_array.count loop
        if l_array(i) is not null then
          l_predicates := l_predicates || case when i > 1 then ' and ' else '' end || '(name() != "' || l_array(i) || '")';
        end if;
      end loop;
      if l_predicates is not null then
       l_filter := l_filter || '/*[' || l_predicates || ']';
      end if;
    end if;
    select regexp_replace( 
              deletexml(
                xmltype('<post>' || html_in || '</post>'),
                l_filter),
              '(^<post>)|(</post>$)',
           '') into l_clean_html
      from dual;
    return l_clean_html;
  end;
The only problem we had using this approach was with German characters or other HTML character entities. For example, by default CKEDIT sends the German umlaut character as &uuml; and, apparently, the Oracle XML parser is not smart enough to recognize this notation. It simply fails with LPX-00118 error. In fact, if you think about it, it's not so unusual given that we are talking about a XML parser, not an HTML one. However, if you encode umlaut with its numerical notation which is &#252; then everything is working like a charm. This boils down to properly configure CKEDITOR in order to force it to encode this characters with their numerical notation. The key property is:
entities_processNumerical": 'force'
Mission accomplished!