CreateUUID() : Friendly Function or Server Killer?
Ok, so I know what you’re thinking. You’re thinking I must be crazy for suggesting that a simple built-in unique randomization function could somehow be instrumental in crashing a ColdFusion server, right? Well, before you go calling me a lunatic, let me assure you that the ColdFusion createUUID() function DOES, in fact, represent a threat to the stability of your server – and I intend to prove it to you.
Before I go too far into this topic, I should point out that this threat really only exists if you happen to be running ColdFusion on a Windows server. For those of you who are running on any sort of *NIX-based server, you can move on – nothing to see here.
If you’re still with me at this point, then I need to do a little explaining.
First off, you should be aware of a known issue with ColdFusion on Windows servers as documented in this Adobe TechNote article. This TechNote basically explains that there’s a bug with certain JVM versions under the hood of ColdFusion that results in the potential for the createUUID() function to actually increase the speed of the system clock within Windows. Now, this acceleration of the Windows system clock is only likely to occur in situations where createUUID() is “heavily utilized”. The tricky part here is that I’m not entirely certain how the TechNote defines “heavily utilized”. Later in this article, however, I will give you an example of a particular customer scenario that was creating this server-crashing behavior as a point of reference. For now, just note that heavy usage of the createUUID() function might cause the Windows system clock to speed ahead by a matter of seconds or even minutes within the span of an hour or two, depending on the volume of function calls being made.
Now, the second vital bit of information comes down to the inner mechanics of how the createUUID() function actually works. The function more or less generates a unique hash/GUID based on a combination of the server machine ID, a cryptographically strong random number, and the present date and time. Now, we all know that calls to the createUUID() function on the same server will all have the same machine ID input value. And even though the chance is extremely slim, the fact remains that two calls to generate a “cryptographically strong random number” could return the same value. That’s where the present date and time input to createUUID() becomes important. If you think about how quickly many ColdFusion templates execute (quite often literally within milliseconds), you can quickly see how a series of calls to simply get the current server date and time (even to millisecond precision) might return the exact same date and time. For this very reason, the createUUID() method makes a call during its execution to another internal method – uniqueTOD(). The purpose of uniqueTOD() is to return a Unique Time Of Day – as the name suggests. So, in order to ensure that two calls never come back with the same timestamp, uniqueTOD() has been designed as a “synchronized” method. In the Java world, that just means that only one thread can execute the method at a time: any attempt to invoke it while another invocation is already running will be blocked until the running invocation completes.
In addition to this, uniqueTOD() also includes a Thread.sleep(1) statement, which effectively causes the thread to sleep for 1 millisecond by designating a specific date and time at which to wake and resume execution. The combination of these two concepts ensures that dates and times returned by the uniqueTOD() function will always be separated by at least 1 millisecond.
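To make the synchronized-plus-sleep idea a little more concrete, here is a minimal Java sketch of how a method like uniqueTOD() could behave. To be clear, this is not ColdFusion’s actual source. The class name, the lastIssued field, and the retry guard around the sleep are all my own assumptions, added so the sketch still returns distinct values on systems where the millisecond clock ticks coarsely.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the names and the retry guard are assumptions,
// not ColdFusion's real implementation.
final class UniqueTodSketch {
    private static long lastIssued = 0L;

    // synchronized: only one thread at a time may be inside this method.
    static synchronized long uniqueTOD() {
        long now;
        do {
            try {
                Thread.sleep(1); // sleep roughly 1 ms while still holding the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while sleeping", e);
            }
            now = System.currentTimeMillis();
        } while (now <= lastIssued); // assumed guard: retry until the clock has moved forward
        lastIssued = now;
        return now;
    }

    public static void main(String[] args) {
        int calls = 1000;
        Set<Long> seen = new HashSet<>();
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            seen.add(uniqueTOD());
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(seen.size() + " unique timestamps in " + elapsedMs + " ms");
    }
}
```

Because every call sleeps while holding the lock, throughput tops out at roughly one timestamp per millisecond no matter how many threads are asking. Note also that the assumed guard in this sketch would itself stall if the system clock were ever moved backwards.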
If you need proof that this takes place, take the following bit of sample code, toss it on your server, enable server debugging output, point your browser at it, and take a look at the execution time spent.
<cfloop from="1" to="1000" index="i">
<cfset x = createUUID()>
</cfloop>
When you run this code, you’ll quickly notice that it takes roughly 1 second (1,000 milliseconds) to execute, in step with the to attribute value of 1,000 on the cfloop. That’s the 1ms sleep being incurred on each of the 1,000 calls to createUUID() within the loop.
So now that we understand all those mechanics, it’s time for me to unveil where this might become a server-killing problem. There’s a handy bit of operating system housekeeping that most of us take completely for granted these days – automatic internet-based time synchronization. Consider the following scenario with me.
- Your server starts off at 1:00:00.000pm with a date and time that are synchronized to an internet time server of some sort.
- Your ColdFusion application makes “heavy utilization” of the createUUID() function over the course of 2 hours, gradually advancing your server clock by 200 seconds or so as a result of the behavior/bugs I’ve explained above.
- Your ColdFusion server is in the middle of a series of high-volume calls to the createUUID() function.
- Your server makes a call to the configured internet time server, realizes the system clock is currently running about 200 seconds fast, and resets the system clock appropriately to 3:00:00.000pm.
- As the system clock was turned back to the proper time, a single internal call to the uniqueTOD() function had already assigned a “wake time” of 3:03:20.001pm (based on having previously been 200 seconds ahead).
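The arithmetic behind that wake time can be checked with a few lines of Java. This is only a sketch of the scenario’s numbers: the class and method names are made up for illustration, and the 200-second drift and 1ms sleep are the figures from the list above.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper: works through the numbers in the scenario above.
final class WakeTimeSketch {
    // How long a sleeper waits when its wake time was fixed against a drifted
    // clock and the clock is then corrected back to the true time.
    static Duration stallAfterCorrection(LocalTime correctedNow, Duration drift, Duration sleep) {
        LocalTime driftedNow = correctedNow.plus(drift); // what the fast clock claimed
        LocalTime wakeTime = driftedNow.plus(sleep);     // wake time chosen before the correction
        return Duration.between(correctedNow, wakeTime);
    }

    public static void main(String[] args) {
        LocalTime corrected = LocalTime.of(15, 0);  // 3:00:00.000pm after synchronization
        Duration drift = Duration.ofSeconds(200);   // clock was running 200 seconds fast
        Duration sleep = Duration.ofMillis(1);      // the Thread.sleep(1) inside uniqueTOD()

        System.out.println(corrected.plus(drift).plus(sleep));                           // prints 15:03:20.001
        System.out.println(stallAfterCorrection(corrected, drift, sleep).toMillis());    // prints 200001
    }
}
```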
In the scenario outlined above, you now have a createUUID() call waiting on an internal uniqueTOD() call that won’t actually complete execution until 3:03:20.001pm. Prior to the call to synchronize the system clock with the internet time server, this would have only been 1 millisecond away. But now that the server clock has been corrected, we’re suddenly 3 minutes, 20 seconds, and 1 millisecond away from that “wake time”. And what makes matters worse is that the waiting uniqueTOD() invocation is also synchronized. This means that all other invocations to uniqueTOD() or to any other functions that include the use of uniqueTOD() must now sit and wait until the running/waiting invocation “wakes” and completes at 3:03:20.001pm.
Wait a minute. What’s that I hear in the background? Oh, yes… that’s the sound of thread activity grinding to a halt on your production server as threads are suddenly being blocked for 3 minutes, 20 seconds and 1 millisecond until that uniqueTOD() invocation finally wakes and completes in order to generate that UUID that was requested.
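If you’d like to watch that pile-up happen without waiting for a real clock correction, the following Java sketch simulates it. It is a simulation only: the stall is faked with a long sleep inside a synchronized method (a stand-in for uniqueTOD(), not the real thing), and all names here are assumptions of mine. One thread takes the lock and sleeps, the remaining callers queue up behind it, and the code counts how many of them the JVM reports as BLOCKED.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation: a long sleep stands in for the stretched wake time.
final class StalledLockSketch {
    // Stand-in for a synchronized uniqueTOD(): sleeps while holding the class lock.
    static synchronized void uniqueTodStandIn(long sleepMillis) {
        pause(sleepMillis);
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts one stalled holder plus `callers` ordinary callers and returns how
    // many callers were observed in the BLOCKED state while the holder slept.
    static int countBlockedCallers(int callers, long stallMillis) {
        Thread holder = new Thread(() -> uniqueTodStandIn(stallMillis));
        holder.start();
        // Wait until the holder is asleep inside the synchronized method.
        while (holder.isAlive() && holder.getState() != Thread.State.TIMED_WAITING) {
            pause(1);
        }

        List<Thread> waiting = new ArrayList<>();
        for (int i = 0; i < callers; i++) {
            Thread t = new Thread(() -> uniqueTodStandIn(1));
            t.start();
            waiting.add(t);
        }

        // Poll until every caller is stuck on the lock, or the holder wakes first.
        int blocked = 0;
        while (holder.isAlive()) {
            blocked = 0;
            for (Thread t : waiting) {
                if (t.getState() == Thread.State.BLOCKED) {
                    blocked++;
                }
            }
            if (blocked == callers) {
                break;
            }
            pause(1);
        }

        // Let everything drain before returning.
        try {
            holder.join();
            for (Thread t : waiting) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked;
    }

    public static void main(String[] args) {
        int blocked = countBlockedCallers(5, 500);
        System.out.println(blocked + " callers were blocked behind the sleeping holder");
    }
}
```

In the sketch the holder only sleeps for half a second. Swap in 200,001 milliseconds and you have the scenario above, with every request thread that needs a UUID parked in that BLOCKED state for the duration.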
I actually tracked this situation down for a Webapper customer who had been experiencing frequent yet unpredictable “mystery crashes” of their ColdFusion instances. Our customer was very technically proficient and had already looked under most of the “rocks” we would typically turn over in this sort of situation. During our initial kick-off call with the customer, we were all in consensus that this was likely a result of some sort of database transaction locking and blocking that we’d need to track down. After a little investigation and profiling on the servers in question, I realized that was simply a red herring. With the help of SeeFusion, we were able to snag stack traces from the server as the crashes were actually taking place and quickly determine that the entire system was hanging up WAY before ever getting to any database activity. In fact, the application seemed to be hung on the first line of their Application.cfm file – the <cfapplication> tag.
That’s when it hit me. The customer’s application was architected in such a way that there were very high-volume rapid-succession web service calls being made to a ColdFusion component. By high-volume rapid-succession, I specifically mean 2 or 3 calls every 50ms. That component, by design, was including the root Application.cfm file. And the <cfapplication> call was hanging because the clientmanagement attribute had been set to “true” and the customer had enabled the “Use UUID for CFTOKEN” checkbox in the ColdFusion Administrator. Each web service call was resulting in the assignment of a new CFID/CFTOKEN pair – an entirely separate blog-worthy issue. Suddenly, the whole situation made perfect sense – well, at least in theory. The real test, of course, would be to have the customer uncheck the “Use UUID for CFTOKEN” setting and see if the “mystery crashes” came to an end. I’m happy to report that they haven’t had a system hang since disabling that ColdFusion setting.
So, if you have any code in your server that might be creating opportunities for high-volume utilization of the createUUID() function, then you just might be one internet clock synchronization cycle away from a complete ColdFusion meltdown.

Tyson, I can confirm the significance of this. I’ve seen it before myself, and it can be devastating.
Some will say, “well, we don’t use client variables”, but that’s not the point if you leave the client variable repository setting for “disable global client variable updates” unchecked, as it is by default. I won’t elaborate here.
As Tyson says, the subject of the impact of non-browser clients (not just web service calls, but search engine bots, RSS readers, ping tools, scheduled tasks, and more) is worthy of a separate blog entry, and it goes beyond client variables to sessions as well. I don’t know if he was meaning he might write one, but until then, I have one I can share:
Suffering CPU, DB, or memory problems in CF? Spiders could be killing you in ways you’d never dream
http://www.carehart.org/blog/client/index.cfm/2006/10/4/bots_and_spiders_and_poor_CF_performance
I also did a talk on the topic at the CF meetup just a couple of weeks ago, and the recording is here:
Sessions and Clients and Crashes, Oh My!
http://www.carehart.org/presentations/#sessclicrash
In both, I discuss the performance impact of wildly unexpected creation/update of client repository records in the database or registry, and the related memory impact of unexpected creation of sessions.
I have to admit I didn’t think to mention this point about UUIDs, which is a shame as I’ve been burned by it before. It’s a really good point and I took note to mention it next time I give the talk (and will point to your blog entry, in thanks).
BTW, I’d welcome having you or any Webappers on to speak at the CF meetup (coldfusionmeetup.com) some time. You guys always have good stuff to share. Thanks.
Comment by charlie arehart — May 5, 2009 @ 10:41 am
what version of the jvm were they running on their server? the one that was shipped with cf8 isn’t all that good. i would have them update the jvm and see if these problems still exist.
Comment by tony petruzzi — May 5, 2009 @ 12:22 pm
Tyson, thanks for posting all these details.
I do have to ask why not update the JVM, rather than disable CreateUUID for CFToken? According to the Adobe technote, this bug is only present in 1.3 and 1.4 JVMs.
I just don’t want everyone to go and disable CreateUUID for CFToken because of this. Without that setting you will have terrible session entropy, which makes session hijacking easier.
Comment by Pete Freitag — May 5, 2009 @ 12:27 pm
@Charlie: Thanks for the feedback. I actually didn’t realize you had touched on these topics before. Shows you just how infrequently I dip my head into the blogosphere. But, I’m glad to hear validating confirmation from another highly-respected resource that this little combination of elements has caused troubling production problems. Your meetup presentation is spot on. Now, we just need to talk to you about what it’s going to take to get you mentioning SeeFusion instead of or in addition to mentioning FR when pointing out great ColdFusion server analytics tools. =)
@Tony & @Pete: Great points about the JVM. I did a fair amount of research on Adobe’s and Sun’s perspectives on this since there are claims that newer updated JVMs address this problem. However, take a look at this.
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode, sharing)
That’s from the customer server on which this problem was originally taking place and was ultimately resolved. This was, in fact, happening on a newer JVM than 1.3 or 1.4. In addition to that startling fact, the reality I see with many customers is that they are powerless to approach the problem from this angle anyways. For clients in shared hosting arrangements, they are often tethered to whatever JVM version ColdFusion happens to be coupled with. While I’d like to think that hosting providers are up-to-speed on these sorts of updates and optimizations, I have to assume based on my first-hand experiences that plenty of customers out there are working with mass-produced “default” ColdFusion installations that would suffer from this condition.
Comment by Tyson Vanek — May 5, 2009 @ 1:14 pm
Hey Tyson, thanks for the kind regards. As for my talk, you say, “we just need to talk to you about what it’s going to take to get you mentioning SeeFusion instead of or in addition to mentioning FR when pointing out great ColdFusion server analytics tools.”
I’m disappointed to hear that.
I try always to mention both (and the CF8 monitor) whenever I introduce either. There’s clearly a place for each.
Indeed, on slide 12 of my talk I do just that. Did you miss it, perhaps?
I’d just be bummed if you guys haven’t noticed that I always mention both whenever I introduce CF monitoring (whether in a talk, blog entry, or article). I really try hard to spread that love evenly. They’re all great products, each with their own advantages.
Comment by charlie arehart — May 5, 2009 @ 1:57 pm
@Charlie: Well, I was mostly kidding about the SeeFusion references. Admittedly I did not watch the entire presentation beginning to end – though I did watch most of it and jump around a little. I clearly missed the slide 12 reference to SeeFusion. I just caught a spoken reference somewhere near the middle with regard to logging the client variables queries using FusionReactor. Bottom line is, you’re right – they’re all great products and are a present-day necessity in troubleshooting these sorts of deeper architecture issues that unknowingly plague so many ColdFusion applications and architectures. Who’d have thought back in the days of ColdFusion 1.0 (yes, I was working with it way back then) that ColdFusion would evolve from its origin as a glorified “power to the people” CGI process into such a powerful and complex enterprise-grade application server? =)
Comment by Tyson Vanek — May 5, 2009 @ 2:08 pm
Great discussion and always important to note as createUUID() gets more popular to use. Just a few points to make. First, createUUID() is not truly random in a mathematical sense, nor was it designed to be. It’s designed to always be unique. You pretty much showed that in the way the UUID is constructed. Just wanted to make sure that point was also made here. Second, CreateUUID is a great function to use and certainly vital in use with CFToken. Developer(s) need to use it. At the same time, this is a great example that shows how things can easily go wrong if the developer(s) aren’t sensitive to its nature. Unchecking UUID for cftoken wouldn’t be my first choice here for the reasons Pete stated. It fixed your problem, but at a huge cost in possible session hijacking. The design of the webservice calls to CFCs with the client management in the way would need to be altered instead. Not fun for the developer(s), but it’s the right way to go here.
Comment by John Mason — May 5, 2009 @ 4:01 pm
@John: Thanks for the comment. Yes, it’s good to point out that the result of a createUUID() call is more of a calculated unique value than a truly random value. And yes, I agree that integer-based CFTOKEN values pose a threat in the category of client/session hijacking. I should point out that unchecking the “Use UUID for CFTOKEN” option was just our first pass at solving this issue. I did, in fact, have a follow-up call with the customer and discuss strategies for re-structuring both their load-balancer HTTP probe calls and web service invoked CFCs in such a way that they did not incorporate the larger client/session reliant application context. And the official plan is to re-enable the “Use UUID for CFTOKEN” option once they’ve restructured their application code as we’ve suggested.
Comment by Tyson Vanek — May 5, 2009 @ 4:57 pm
@Tyson, thanks and no worries. And yes, it’s a great point we should draw out, that the monitors (all three) can monitor the DSN to track queries (read/update/insert/delete) against any DSN set to be client variable repository. It’s another great way to see the impact of the issues we’re talking about.
Separately, are you guys aware that when we submit this form we’re taken to a blank page? Is that intentional? It just makes it appear that the comment didn’t take, and it seems that you’re doing comment moderation as a refresh of the entry doesn’t show such a submission showing up immediately. Even if it needs to be a new page, how about at least saying “we got your comment. It will now await approval”, rather than a blank page? Somehow I think this blank page is unexpected behavior. Or do you see it, too?
Finally, I’m not getting email notifications of new comments. Is that to be expected? I realize you have no box offering it as an option, but since the form requires an email address, it seemed we should expect notification by default. Has this concern been raised and addressed elsewhere, perhaps in the comments of another entry, if you guys may not want to revisit the discussion here? Yes, I do see the RSS feed option. I just am surprised by the lack of an email option and wanted to ask if the lack of an email notification is indeed intentional.
Comment by charlie arehart — May 5, 2009 @ 5:20 pm
Just an update to something I wrote before: I referred to a “slide 12” where I said I mentioned both FR and SF. My bad: I was referring to the wrong talk (my CF911 talk). In this “Clients and Sessions and Crashes: Oh My” talk, I didn’t have much content on the slides and did most of my talking off the cuff. I may well still have referred to FR at some point but I do think/hope that when I first mentioned monitors I’d have mentioned all three, unless I was speaking of something I thought was unique to one of them.
Comment by charlie arehart — May 5, 2009 @ 5:27 pm
Tyson: thanks for a great post! Quick question: the TechNote you reference mentions a JVM switch recommended by Sun as a fix; did you (or Charlie, given that he indicated he had been burned by this, too) try that fix as a possible means of resolving this? Looking at the TechNote and the Sun bug report referenced within the TechNote, it seems like this is present for at least some people in a fairly broad range of JVM versions, too.
Comment by Ron Stewart — May 6, 2009 @ 6:47 am
@Ron: The Adobe TechNote references this behavior as being specific to 1.3.x and 1.4.x JVMs. However, I encountered this problem with a customer on a new install of CFMX 8 on JVM 1.6.0_13. This sort of makes sense to me since the original JVM bug 4500388 is listed with a status of “Fix Delivered”, but a subsequent JVM bug 6435126, which I believe still contributes to this behavior, is listed with a status of “Cause Known”. The customer that I mentioned in the article did, in fact, nod their head in acknowledgement that they had seen their system clock running a bit fast on the server in question – a direct result of createUUID() being called on a near continual basis in their application with a frequency of twice every 50ms.
As far as I’m concerned, this problem is not specific to any single JVM version and should be planned for and treated as such.
Comment by Tyson Vanek — May 8, 2009 @ 3:40 pm
This is great information to have. Bookmarking for sure.
Thanks!
Comment by Andy Sandefer — May 13, 2009 @ 4:21 pm
Nice catch Tyson. Talking about coming back to bite you. I believe I wrote that technote back in 2006 after a customer pointed it out to me during a PA&T (you guys remember those?) This thing gets even more complicated when you use time synch software. We did the same loop test and literally watched the second hand spin with each iteration.
Long story short, the technote is really old and hasn’t been updated. The new technote system doesn’t seem to display the product version but this was back in the 6.0.x days. Since I don’t support CF anymore I hadn’t looked into this with CF8 but looks like I’ll have to get the team on it. Ron’s question still remains — has anyone tried the -XX:+ForceTimeHighResolution JVM flag to prevent this with their chosen Sun JVM version & CF8?
Comment by Sarge — May 28, 2009 @ 10:53 am
@Sarge: Hey man, great to see you getting involved. =) Do I remember PA&T? Geesh, that’s a silly question. Aside from the fact that I spent many a week sleeping on temperature controlled server room floors during weeks on site with Allaire customers, I’m actually still performing similar sorts of “PA&T” engagements for customers in my present day work. As we all know, it’s one thing to develop and write a good ColdFusion application; another thing entirely to engineer for stability, scalability, concurrency, etc.
But, back to your question. I have not, in fact, tested with the -XX:+ForceTimeHighResolution JVM flag enabled. In looking at the Sun bug detail cited by that Adobe TechNote, it seems there’s an open related issue to that original bug that was thought to be resolved. The open related issue seems to indicate that the behavior is still a problem.
Fact is, I sort of had limited accessibility to the customer environment in which we originally diagnosed this issue so I wasn’t able to go so far as altering the JVM config and experimenting with adding the flag. And in my own local environments, I’m not even running ColdFusion on a Windows platform anymore. I’ve happily followed in the footsteps of many before me who have made the conversion over to Mac. I’m still running a Windows XP virtual instance that I suppose I could leverage for some testing. But my primary development configuration of ColdFusion is installed within my Mac OS.
Keep me apprised if you or someone you know manage to produce evidence that the JVM flag resolves this issue under the same configuration. I’ll be sure to update the blog posting appropriately with such finds.
Comment by Tyson Vanek — June 4, 2009 @ 2:56 pm