Bluemini.com

CF_This_Helps

Writing large output files with ColdFusion

posted: 03 Aug 2008

Have you ever tried to generate a large file in ColdFusion? It's painful to say the least.

There are basically 5 common CF ways to get data into a file. Some of these methods are faster than others, and some are memory hogs. What we want is a combination of high speed and low memory usage. And as you'll see, we have a winner.

  1. CFFILE append for each record of output. This method is very easy to implement, but killer on the server. Don't do it.

  2. Build string and then CFFILE append less frequently. This is a popular workaround for the speed problems found in CFFILE append. But it can be very memory intensive, and slow. Do you know how long the string may get? If you're looping over data from a database, XML file, etc., you may not know ahead of time how much data you have to process. If you want to ensure you don't exceed some predetermined string size limit, then you need to program that logic into your application. ColdFusion strings are immutable, meaning they cannot change. When you concatenate two strings, CF creates an entirely new string for you. That's fine if you're performing a few dozen append operations, but at some point, the append operations will slow to a crawl and continue to get worse. And server memory will dry up fast. Assuming you spend the time checking string lengths and writing data at predetermined limits, this option can be OK. But it's a lot of work.

  3. CFSAVECONTENT is another popular way of saving large content. It's fairly quick, easy to implement, and works fine for small to mid-size data batches. But watch out when you start generating larger files. It's a memory hog.

  4. ColdFusion array can give you one of the best performances (speed-wise) of the options listed here. Basically, instead of building a string using concatenation (option 2), you add each string to an array. When you're done adding strings to the array, just call arrayToList() to generate your output. It does have its limits, however. With large data sets, it will slow down considerably. Also, as in option 2, you should put some sort of size limit checking on your array and perform write operations when the limit is reached.

  5. Using Java classes FileOutputStream, OutputStreamWriter, and BufferedWriter, you can implement a very fast file writer that is also very light on server memory. When the BufferedWriter class is initialized, you can tell it how large a buffer you would like. The default is 8k. You can set it higher if you know you're doing a lot of output. I found that a 32k or 64k buffer can help speed things up, but you don't see much improvement going much higher. With Java, you open the file, perform all the writes you need, and then close the file. Java determines when the buffer is full and when an actual I/O needs to occur. You just get busy writing!

I'll kick off this blog by presenting a tried and true copy of my FileWriter.cfc. It wraps all of the functionality you need to write files. It even supports Unicode. In fact, if you specify an encoding type of UTF-8 or UTF-16, the FileWriter will write a byte order mark (BOM) to the file for you.

You don't need to know Java to use the FileWriter.cfc. In fact, here's how easy it is to implement:

<cfscript>
myOutFile = createObject('component', 'FileWriter').init(fileName, encoding, bufferSizeinK);
// loop and write
myOutFile.writeLine(variables.someDataVar);
// end loop
myOutFile.close();
</cfscript>
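
If you're curious what option 5 looks like under the hood, here is a minimal Java sketch of the same idea: a BufferedWriter wrapped around an OutputStreamWriter and a FileOutputStream, with a UTF-8 byte order mark written first. This is an illustration, not the source of FileWriter.cfc. The class name BufferedFileWriterSketch and the writeLines method are made up for the example, and it only handles UTF-8.

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sketch; not the actual FileWriter.cfc implementation.
final class BufferedFileWriterSketch {

    // Writes each line to fileName as UTF-8, preceded by a byte order mark.
    static void writeLines(String fileName, List<String> lines, int bufferSizeInK)
            throws IOException {
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(fileName), StandardCharsets.UTF_8),
                bufferSizeInK * 1024)) {
            out.write('\uFEFF'); // U+FEFF is encoded as the UTF-8 BOM bytes EF BB BF
            for (String line : lines) {
                out.write(line);   // lands in the buffer; Java decides when to hit the disk
                out.newLine();     // platform line separator
            }
        } // closing the writer flushes whatever is still buffered
    }
}
```

Calling BufferedFileWriterSketch.writeLines(fileName, lines, 32) opens the file once, buffers the writes in 32k chunks, and closes the file at the end, which is the same open/write/close rhythm as the CFC usage above.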

You can download a zipped copy of the ColdFusion Java CFC FileWriter.cfc here.

Using the Java StringBuffer class in ColdFusion

posted: 03 Aug 2008

If you have ColdFusion code that requires concatenating or appending strings, you may want to look into using the Java StringBuffer class.

The StringBuffer.cfc program (listed below) acts as a wrapper for the native Java StringBuffer methods, performing a lot of the dirty work, like ensuring javaCast() is performed on ALL variables passed to the Java method calls.

If your CF program is looping through data and concatenating or appending strings, you will likely see a huge performance increase by implementing the StringBuffer.cfc. However, you'll likely see no difference with small sets of data. If your code has the potential to perform more than 100 append operations, StringBuffer can speed things up.

ColdFusion's string variables are immutable, meaning they cannot change. In CF, when you append one string to another string, you get an entirely new string. As the string grows in size, each concatenation operation takes more RAM, and more CPU. At some point, your server will lock up. I haven't seen this personally, but I know a guy...

The Java StringBuffer is mutable, meaning it can change. When you append data, it really does just add to the end of an existing string. Much faster. Especially when you're working with larger data sets.

For testing, I looped through 1,000 and 10,000 iterations and compared the regular CF append, myVar = myVar & someOtherVar, with the Java StringBuffer append. The differences in speed are amazing. There is one oddity: if you have a small number of concatenations, ColdFusion is actually faster than StringBuffer. But StringBuffer shines when you have 100 or more appends. As the string gets bigger, Java StringBuffer easily pulls away from ColdFusion in speed, by a very large margin.

And it's really easy to implement; 3 commands are all you need to know.

<cfscript>
    myStr = createObject('component','StringBuffer').init();
    myStr.append(variables.someString);
    myStr.getString();
</cfscript>

It works lightning fast on string append intensive code.

An example of StringBuffer in action:

<cfscript>
for (i = 1; i LTE dbQuery.recordCount; i = i + 1) {
    if (someCondition) {
        myStr.append(dbQuery.dbColumn[i]);
    }
}
</cfscript>

Download the ColdFusion Java CFC StringBuffer.cfc and take it for a test drive. Make a simple loop and run it 1,000 times and 10,000 times, and see the difference. StringBuffer can be up to 4 times faster than CF.

If you're running Java 1.5 or newer on your ColdFusion server, you can benefit even more by using the StringBuilder.cfc. The StringBuilder.cfc program utilizes the new StringBuilder class in Java 1.5. According to the Sun Java 1.5 API:

"This class provides an API compatible with StringBuffer, but with no guarantee of synchronization. This class is designed for use as a drop-in replacement for StringBuffer in places where the string buffer was being used by a single thread (as is generally the case). Where possible, it is recommended that this class be used in preference to StringBuffer as it will be faster under most implementations."

So if you're running Java 1.5 on your CF server, you may want to opt for downloading the ColdFusion Java CFC StringBuilder.cfc.
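
To make the immutable versus mutable point concrete, here is a small Java sketch of the two approaches side by side. The class and method names (AppendSketch, joinWithConcat, joinWithBuilder) are invented for the illustration, and this is plain Java rather than the CFC wrappers. Both methods produce the same string; the difference is in how much copying happens along the way.

```java
// Hypothetical sketch contrasting immutable concatenation with a mutable builder.
final class AppendSketch {

    // Concatenation: strings are immutable, so every += builds a brand new
    // string containing everything appended so far plus the new piece.
    static String joinWithConcat(String piece, int times) {
        String result = "";
        for (int i = 0; i < times; i++) {
            result += piece;
        }
        return result;
    }

    // StringBuilder: appends go into one growable internal buffer,
    // and a String is only created once, at the end.
    static String joinWithBuilder(String piece, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }
}
```

If you want to see the gap for yourself, time both methods with a few thousand iterations, the same way you would test the CFCs.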

If you don't want to wait to see what other CFCs are available, you can go to my big 1 page ColdFusion site and get the list.

If you have any questions, comments, or suggestions, please let me know.

Reading large files with ColdFusion

posted: 03 Aug 2008

Reading in a large file with ColdFusion can be a little scary. When using the CFFILE read tag in CF, the entire file is read into a variable. Then it's up to you to parse through the data in the variable.

The FileReader.cfc program (listed below) utilizes the Java classes FileInputStream, InputStreamReader, and BufferedReader which enable you to read large files, line by line, with ColdFusion.

When the FileReader CFC object is initialized, you can define the buffer size for the read-ahead. You can also specify the file encoding, such as ASCII, UTF-8, etc.

Overall, for small to medium sized files, CFFILE will probably be your best bet. But if you like the idea of being able to read in a record-at-a-time, or have the need to process large files, FileReader.cfc could come in handy.

Using the FileReader ColdFusion CFC is easy. You basically use 4 commands:

  1. createObject() - to create the FileReader.cfc object.
  2. readLine() - reads a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
  3. isEOF() - returns a Boolean value whether the end of file has been reached.
  4. close() - closes the file.

Here is a simple example of FileReader in action:

<cfscript>
    inFile = createObject('component', 'FileReader').init('c:\temp\foo.txt', 'utf-8', 32768);
    inStr = inFile.readLine();

    while ( not inFile.isEOF() ) {
        // process inStr here, then fetch the next line
        inStr = inFile.readLine();
    }

    inFile.close();
</cfscript>
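
For reference, the same line-at-a-time pattern in plain Java looks roughly like the sketch below. It is not the FileReader.cfc source; LineReaderSketch and countLines are hypothetical names, and counting lines simply stands in for whatever per-record processing you need.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical sketch; not the actual FileReader.cfc implementation.
final class LineReaderSketch {

    // Reads the file one line at a time, so only the current line
    // (plus the read-ahead buffer) is held in memory.
    static long countLines(String fileName, String encoding, int bufferSize)
            throws IOException {
        long count = 0;
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), Charset.forName(encoding)),
                bufferSize)) {
            String line;
            // readLine() returns null at end of file; '\n', '\r', and "\r\n"
            // all count as line terminators.
            while ((line = in.readLine()) != null) {
                count++; // process the line here
            }
        }
        return count;
    }
}
```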

You can download a zipped copy of the ColdFusion Java CFC FileReader.cfc here.

You can also see other ColdFusion Java CFC programs available for download here.

Please let me know if you have questions, comments, or feedback.

ColdFusion XML Database Schema and Data Export CFC

posted: 03 Aug 2008

The ExportDbToXml.cfc, listed below, generates a UTF-8 XML file containing a SQLServer database schema, or a schema and data. It currently only works for SQLServer because the ColdFusion CFC queries the information_schema views in SQLServer in order to get the database metadata. It could be easily modified to query the appropriate system tables or views for Oracle, PostgreSQL, or just about any other major DB out there.

In order to ensure safe data export for XML, all textual data is wrapped in a CDATA section. For obvious reasons, binary data is not exported to the XML file. However, the schema is still generated for binary fields; only the contents of those fields are left blank in the XML file.
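
As an illustration of the CDATA wrapping, here is a minimal Java sketch. The CdataSketch class and its wrap method are hypothetical and not taken from ExportDbToXml.cfc. One detail worth knowing: text that itself contains "]]>" would end the section early, so the sketch splits that sequence across two CDATA sections. Whether the CFC handles that case is not something I can confirm here.

```java
// Hypothetical sketch of CDATA wrapping; not the ExportDbToXml.cfc source.
final class CdataSketch {

    // Wraps text in a CDATA section so characters like < and & need no escaping.
    static String wrap(String text) {
        // "]]>" inside the text would close the section early, so end the
        // section after "]]" and reopen a new one before the ">".
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }
}
```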

It's easy to implement, and you could also easily build a simple front end form to the ColdFusion Java CFC. The XML export CFC takes 4 parameters:

  1. dataSource - Name of the datasource
  2. outDir - The directory to place the XML file. The file is always called 'dataSource'-DbExport.xml
  3. schemaOnly - A Boolean. Do you only want the DB schema? (0 for schema and data. 1 for schema only.)
  4. tableName
    • Blank - Will give you all non-system tables in the database
    • tableName(s) - A single table name, or a comma delimited list of table names

To run the XML export CFC:

<cfscript>
    xmlXport = createObject('component','ExportDbToXml').init();
    // arguments: dataSource, outDir, schemaOnly (0 or 1), tableName (blank for all tables)
    xmlXport.export(dataSource, outDir, schemaOnly, tableName);
</cfscript>

You can download a zipped copy of the ColdFusion Java CFC ExportDbToXml.cfc here.

The ExportDbToXml.cfc requires the ColdFusion Java CFC FileWriter.cfc in order to operate.

You can find the entire list of ColdFusion Java CFC programs here.

Enjoy.

SQL Batch Processing with CF and Java

posted: 03 Aug 2008

Sometimes, out of necessity or otherwise, you have to process data files with ColdFusion MX and then apply the data to a database such as SQLServer, Oracle, or PostgreSQL. Depending on the file size, that could mean a lot of cfqueries to perform. Reducing the number of cfquery calls can greatly increase performance.

SqlStringBufferArray.cfc (listed below) was designed to handle large transaction sets as a batch, meaning all statements will work, or none will work. By building large strings of SQL statements, database roundtrips can be greatly reduced. SqlStringBufferArray.cfc creates a ColdFusion MX array of Java StringBuffer objects. The Java StringBuffer class is extremely fast and efficient at building large strings.

To keep things reasonable, and to not exceed potential limits (CF, SQLServer, etc), each StringBuffer object has a 64k size limit. Meaning each string buffer can contain as many SQL statements as you can fit in 64k. But you don't need to worry about checking string sizes or array positions. When the string buffer fills up in array position 1, SqlStringBufferArray.cfc automatically increments the array position and creates a new string buffer object.

As a developer, you don't need to worry about when to start a new string, or if a SQL statement will fit on the string before you need to create a new string. You simply add SQL statements, and when you're done, submit the batch.

Here's how to implement the ColdFusion Java CFC object:


<cfscript>
   sqlBuf = createObject("component","SqlStringBufferArray").init();
   // inside your processing loop:
   //    ...build SQL statement
   sqlBuf.append(sqlStatement);
   // end loop
</cfscript>
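
The roll-over behavior described above can be sketched in a few lines of Java. This is a guess at the general shape, not the SqlStringBufferArray.cfc source: the method names append, sqlArrayLength, and getString mirror the calls used in this post, but the class name, the newline separator between statements, and measuring the 64k limit in characters are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual SqlStringBufferArray.cfc implementation.
final class SqlBufferArraySketch {
    private static final int LIMIT = 64 * 1024; // assumed: 64k characters per batch
    private final List<StringBuilder> buffers = new ArrayList<>();

    SqlBufferArraySketch() {
        buffers.add(new StringBuilder());
    }

    // Adds a statement to the current buffer, starting a new buffer first
    // if the statement would push a non-empty buffer past the limit.
    void append(String sqlStatement) {
        StringBuilder current = buffers.get(buffers.size() - 1);
        if (current.length() > 0
                && current.length() + sqlStatement.length() + 1 > LIMIT) {
            current = new StringBuilder();
            buffers.add(current);
        }
        current.append(sqlStatement).append('\n'); // assumed separator
    }

    int sqlArrayLength() {
        return buffers.size();
    }

    // Positions are 1-based, like a ColdFusion array.
    String getString(int position) {
        return buffers.get(position - 1).toString();
    }
}
```

A statement is never split across two buffers, so a single statement longer than the limit simply gets a buffer to itself.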


To submit the SQL batch, perform the logic below. You basically loop over the array, submitting each 64k SQL batch to cfquery (one at a time).


<cftransaction action="begin">
   <cftry>
      <!--- loop over SQL batches and apply to DB --->
      <cfloop index="i" from="1" to="#sqlBuf.sqlArrayLength()#">
         <cfset sqlString = sqlBuf.getString(i) />
         <cfquery name="runSQL" datasource="DSN">
            #preserveSingleQuotes(sqlString)#
         </cfquery>
      </cfloop>

      <cftransaction action="commit" />

      <cfcatch type="database">
         <!--- ...error trapping here --->
         <cftransaction action="rollback" />
      </cfcatch>
   </cftry>
</cftransaction>


You can download a zipped copy of the ColdFusion Java CFC SqlStringBufferArray.cfc here.

You can find the entire list of ColdFusion Java CFC programs here.

Tips, suggestions, feedback? Drop a comment. Thanks.

Using CFTRANSACTION in ColdFusion for Transaction

posted: 03 Aug 2008

Most databases today support transactions. Transactions are used to group multiple SQL statements into a single transaction set that results in a single known outcome. All statements will succeed, or all will be rolled back. This is important to ensure database data integrity and consistency.

Let's say we have an action page that performs three SQL database operations:


  1. Update a parent table

  2. Insert a record into a child table

  3. Log the event into history table

Without transactions, it's possible that #1 and #2 succeed, but step #3 fails. In this case, your database is now in an unknown state. Some of the SQL operations worked, and some failed. Now the database integrity is gone.

If you wrap a transaction around your SQL operations (where appropriate) you can ensure that all SQL statements work, or none will work.

There are proper and improper uses of transactions. The documentation on cftransaction is pathetic to say the least. Many times the lack of good documentation leads to confusion, which results in improper or ineffective uses.

If you have a single SQL statement, then there is no need for a cftransaction. Also, the cftransaction tag does not go around EACH cfquery. The cftransaction tag must surround the multiple queries that make up the transaction set.

I like to be more explicit in the use of a cftransaction. And I almost always use cftransaction in conjunction with a cftry/cfcatch construct.

Here is a simple and proper use of cftransaction:


<cftransaction action="BEGIN">
   <cftry>
      <cfquery name="updateCustomer" datasource="#DSN#">
         UPDATE aTableName
         ...
      </cfquery>
      <cfquery name="insertOrder" datasource="#DSN#">
         INSERT INTO anOrderTableName
         ...
      </cfquery>
      <cfquery name="logEvent" datasource="#DSN#">
         INSERT INTO aLogTable
         ...
      </cfquery>

      <cftransaction action="COMMIT" />

      <cfcatch type="DATABASE">
         <!--- Custom error trapping here --->
         <cftransaction action="ROLLBACK" />
      </cfcatch>
   </cftry>
</cftransaction>


In the above example, the logic flows sequentially through the SQL statements. If they all succeed, the code will fall through to the <cftransaction action="COMMIT"> tag and then we're done. However, if a DB exception is thrown during processing of any of the SQL statements, then control is passed to the <cfcatch type="DATABASE"> section where custom errors can be displayed and a <cftransaction action="ROLLBACK"> is performed.


Another interesting use of cftransaction is to perform "dirty reads". If you have a database that is primarily used for queries, and it receives a lot of traffic, you can see a performance improvement by keeping your database from performing unnecessary table/row locking. This results in less DB contention, less lock checking, fewer resources used, and faster queries.

Simply wrap the following cftransaction tag around the query, or queries:


<cftransaction isolation="read_uncommitted">
   <cfquery name="getRecords" datasource="#DSN#">
      SELECT col1, col2, col3
      FROM aTableName
      WHERE ...
   </cfquery>
</cftransaction>


Data that has not yet been committed is referred to as "dirty". It may be rolled back, it may be committed. We don't know yet. So by default, most major DBs will not return uncommitted data, and therefore spend extra resources ensuring only committed data is returned. Also, many databases perform what is known as anticipatory locks, meaning the table will be locked, if only for a split second, just in case it's needed. The <cftransaction isolation="read_uncommitted"> tag removes the unnecessary database checking and locking, which in turn helps speed things up. This tag is only used with SQL SELECT statements; it is not to be used with UPDATE, INSERT, or DELETE statements (for obvious reasons).

Compressing strings in ColdFusion using Java

posted: 03 Aug 2008

Have you ever wanted to compress a string to save space and/or bandwidth? Now you can.

StringZip.cfc is used for compressing string variables; not files. For example, information in a text or textarea field, or a variable holding XML, etc.

StringZip.cfc uses the Java String, Deflater, Inflater, and ByteArrayOutputStream classes to create a string zipper/unzipper.

One thing to keep in mind: the zip method of StringZip.cfc returns a Java byte array, which you must handle as binary in CF, or as image if using SQLServer. If you Base64 encode it, you lose much of the benefit of compression (Base64 increases size by roughly 33%).

Also, do not zip any text you may need to search, index, etc. If it's in the database, it'll be in an image type field.

StringZip would work best for single record retrieval operations. If you are querying and returning dozens of records for display, it would not make sense to zip the "high-traffic" data; as you would be constantly unzipping strings.

Performance appears to be blazingly fast. And compression looks good too. Of course, the amount of compression will vary depending on the amount, type, and contents of the data to be compressed. In simple testing of 5k XML strings, the resulting compressed strings were about 10% - 15% the size of the original string.

Using StringZip.cfc is extremely easy:


<cfscript>
    someDataString = "This is a test of the string compressor. The more the data, the better the results.";
    zipUtil = createObject("component","StringZip").init();
    compressedData = zipUtil.zip(someDataString);
    uncompressedData = zipUtil.unzip(compressedData);
</cfscript>
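
For anyone wondering what a zip/unzip pair built on Deflater, Inflater, and ByteArrayOutputStream might look like, here is a minimal Java sketch. It is an illustration under assumptions, not the StringZip.cfc source: the class name StringZipSketch is made up, and treating the string as UTF-8 bytes is my choice for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch; not the actual StringZip.cfc implementation.
final class StringZipSketch {

    // Compresses a string into a byte array.
    static byte[] zip(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8); // assumed encoding
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish(); // no more input is coming
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end(); // release native resources
        return out.toByteArray();
    }

    // Restores the original string from bytes produced by zip().
    static String unzip(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream(compressed.length * 2);
        byte[] chunk = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(chunk);
            // Guard against looping forever on truncated or unsupported input.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                inflater.end();
                throw new DataFormatException("incomplete or unsupported compressed data");
            }
            out.write(chunk, 0, n);
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The result of zip() is raw bytes, which is why it has to be stored as binary. Base64 turns every 3 bytes into 4 characters, which is where the roughly 33% size penalty comes from.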


To save compressed data to the database in SQLServer, make sure the column type is IMAGE, and use the cfqueryparam tag with type CF_SQL_BLOB.

To retrieve the data from the DB and uncompress it, run a select statement to return the data, and then run the unzip method, passing queryName.ColumnName to the method.

You can download the ColdFusion Java CFC StringZip.cfc here.

You can find the entire list of ColdFusion Java CFC programs here.

A new site built using ColdFusion

posted: 03 Aug 2008

I spend quite a bit of time scouring the web for news. It gets a little tiring visiting site after site that is overly burdened with graphics. And then finding the news you're looking for is not always an easy task.

I got tired of it. Tired of installing RSS feed readers on my PC and trying to find URLs to XML files that contained good content. So I broke down and created SyndicatePig.com.

www.syndicatepig.com contains a compilation of hundreds of news feeds, constantly updated, to give you a one-stop-shop for your news. And with no graphics!

But not like the other news aggregators. This site is easy to use and understand.

And the site was done using 100% ColdFusion MX 7. OK, there might be a bit of JavaScript, a touch of Ajax, etc, but you know what I mean.

I wanted to leave the site in a somewhat plain-Jane state; after all, we're looking for news. And to stay with the news theme, the site is almost all black/gray/white. I even use the Courier font.

Just pick a category, and select one or more news providers. You'll be presented with a page that contains all of the news you selected. All uniformly formatted for easy reading, and no graphics. You can quickly peruse the news, and key in on only those stories that interest you.

Check it out at http://www.syndicatepig.com.