SBM MODSCRIPT, PART 16 - BASE64DECODE

I recently discovered that I would need a way to do base64 decoding for a ModScript I was writing. This can be tricky, as the output could be a binary value with embedded zeros. You could certainly do this with the output as a Vector with each entry a uint8_t (unsigned byte). However, in my use case, I knew that the data was text and could be represented as a string. As such, I wrote the following:

add_global_const("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", "CONST_BASE64TABLE");
def Base64DecodeAsText( string input ) { // assumes output is valid text (not binary)   
  var sOut = "";
   
  var buf = [uint8_t(),uint8_t(), uint8_t(), uint8_t()];
  var encoded = int( input.size() );
  var count = 3 * ( encoded / 4 );
  var i = 0;
  var j = 0;
  
  while ( sOut.size() < count ) {
    // Get the next group of four characters
    //    'xx==' decodes to  8 bits
    //    'xxx=' decodes to 16 bits
    //    'xxxx' decodes to 24 bits
    for_each( buf, fun( entry ){ entry = 0; } ); // zero out buffer
    var stop = min( encoded - i, 4 ); // never read past the end of the input
    for ( j = 0; j < stop; ++j ) {
      if ( input[i] == '=' ) {
        // '=' indicates less than 24 bits
        buf[j] = 0;
        --j;
        break;
      }

      // find the index_of inside CONST_BASE64TABLE for our value
      buf[j] = fun( s, c ) {
        for ( var i = 0; i < s.size(); ++i ) {
          if ( s[i] == c ) {
            return i;
          }
        }
        return string_npos;
      }( CONST_BASE64TABLE, input[i] );
      ++i;
    }
	
    // Assign value to output buffer
    sOut += char(buf[0] << 2 | buf[1] >> 4);
    if ( sOut.size() == count || j == 1 ) {
      break;
    }
    
    sOut += char(buf[1] << 4 | buf[2] >> 2);
    if ( sOut.size() == count || j == 2 ) {
      break;
    }
	
    sOut += char(buf[2] << 6 | buf[3]);
  }
  
  return sOut;
}
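
As a quick sanity check, the function can be called with a short, known value. This is only a sketch: the variable name "decoded" is made up for illustration, and the use of Ext.LogInfoMsg to write the result to the log is an assumption about how you would want to inspect it.

```chaiscript
// "SGVsbG8=" is the base64 encoding of the text "Hello"
var decoded = Base64DecodeAsText( "SGVsbG8=" );

// Assumption: Ext.LogInfoMsg is available for writing to the log;
// substitute whatever output mechanism your script normally uses.
Ext.LogInfoMsg( decoded );
```

The trailing '=' tells the decoder that the last group of four characters carries only 16 bits, so the second group contributes two characters instead of three.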

The function above walks the input string four characters at a time and builds the decoded output string. Notice that the "buf" variable is a Vector of four unsigned 8-bit integers. Because the decoding relies on bit shifting, it is important to use unsigned byte data to ensure the expected bit-shift result. For each input character, we look up its index in CONST_BASE64TABLE, which gives the 6-bit value that the character represents. Bit shifting then packs each group of four 6-bit values into up to three output characters. The result is the original text. A possible use case for this might be in decoding HTTP headers from a REST call.
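
To make the bit shifting concrete, here is a hand-worked sketch of a single four-character group, "SGVs". It is not part of the script above: the names b0 through b3, byte0 through byte2, and "text" are made up for illustration, and it uses plain int values with an explicit mask to the low 8 bits rather than the uint8_t buffer.

```chaiscript
// Indexes of each character in CONST_BASE64TABLE:
//   'S' = 18, 'G' = 6, 'V' = 21, 's' = 44
var b0 = 18;
var b1 = 6;
var b2 = 21;
var b3 = 44;

// Each index holds 6 bits. Four of them (24 bits) are packed into 3 bytes.
// The mask keeps only the low 8 bits, as an unsigned byte would.
var byte0 = ( ( b0 << 2 ) | ( b1 >> 4 ) ) & 255; // 72  = 'H'
var byte1 = ( ( b1 << 4 ) | ( b2 >> 2 ) ) & 255; // 101 = 'e'
var byte2 = ( ( b2 << 6 ) | b3 ) & 255;          // 108 = 'l'

// Build the text the same way the function does
var text = "";
text += char( byte0 );
text += char( byte1 );
text += char( byte2 );
// text is now "Hel"
```

The first output character takes all 6 bits of the first index plus the top 2 bits of the second; the second takes the remaining 4 bits of the second index plus the top 4 bits of the third; the last takes the remaining 2 bits of the third index plus all 6 bits of the fourth.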

SBM ModScript - Table of Contents
